Sharding and things we’d like to see improved
© Pythian Services Inc 2022 | Confidential|
Igor Donchovski
May-17th 2022
About me
Igor Donchovski
Principal Consultant
Pythian - OSDB
Pythian overview
Pythian maximizes the value of your data estate by delivering advanced on-prem, hybrid, cloud, and multi-cloud solutions and solving your toughest data and analytics challenges.
• 25 years in business
• 450+ experts across every data domain and technology
• 400+ global customers
Partner status
• Gold Partner: 40+ certifications; 5 competencies, incl. Data Analytics and Data Platform
• Silver Partner: 50+ certifications; Migration Factory, Certified Apps, Hosting
• Advanced Partner: 100+ certs/accreds; Migration & DevOps competencies
• Premier Partner: 120+ certs/creds; 6 specializations; MSP badge; Partner & Technical Advisory Boards
• Select Partner: 10+ certs/accreds
• Platinum Partner: 60+ certifications; Advisory Board member
Overview
• Scaling (Vertical and Horizontal)
• What is a sharded cluster in MongoDB
• Cluster components - shards, config servers, mongos
• Shard keys and chunks
• Hashed and range based sharding
• Choosing a shard key
• Things we’d like to see improved
• QA
Scaling
Time for scaling?
• CPU and/or memory becomes overloaded, and the database server either cannot keep up with the request throughput or cannot respond within a reasonable amount of time.
• The database server runs out of storage and cannot store all the data.
• The network interface is overloaded and cannot support all the network traffic it receives.
Scaling
• Vertical scaling
– Adding more power (CPU, RAM, DISK) to an
existing machine
– Might require downtime while scaling up
• Horizontal scaling
– Adding more machines into your pool of
resources
– Partitioning the data on multiple machines
– Parallelizing the processing
– More complex to implement and maintain
Vertical Scaling
Diagram: a single server scaled up from 8 vCPU, 16GB memory and 1TB of storage to 16 vCPU, 32GB memory and 2TB of storage.
Horizontal Scaling - Reads
Diagram: a replica set with one Primary and eleven Secondaries spread across four data centers (DC1 to DC4); five of the Secondaries are non-voting (votes:0), and the 1TB data set is replicated to the members.
Horizontal Scaling - Writes
Diagram: a 1TB data set partitioned across four shards (Shard 1 to Shard 4) holding 256GB each.
Scaling with MongoDB
Scaling per tenant
Diagram: three ways to serve Tenant A, Tenant B and Tenant C.
• Standalone App: one App with a single Tenant DB shared by all tenants
• Database per Tenant: one App with a separate database for each tenant (A, B, C)
• Sharded Multi-Tenant: the App uses a Catalog to locate each tenant (A, B, C … N) in the sharded deployment
Scaling with MongoDB - Sharding
• Shard/Replica set (subset of the sharded data)
• Config servers (metadata and config settings)
• mongos (query router, cluster interface)
mongos> sh.addShard("shardN")
Scaling with MongoDB - Shards
• Contains a subset of the sharded data
• Deployed as a replica set for redundancy and HA
• Primary shard: picked by mongos for each database as the shard with the least amount of data; it holds that database's non-sharded collections
• Started with --shardsvr, or sharding.clusterRole: shardsvr in the config file (default port 27018)
Scaling with MongoDB - Config Servers
• Store the metadata for the sharded cluster in the config database
• Store authentication configuration information in the admin database
• Deployed as a replica set only (>= 3.4)
• The balancer runs on the Primary node (>= 3.4)
• Started with --configsvr, or sharding.clusterRole: configsvr in the config file (default port 27019)
Scaling with MongoDB - mongos
• Caches metadata from the config servers
• Routes queries to shards
• Holds no persistent state
• Updates its cache on metadata changes
• Hosts the balancer (MongoDB <= 3.2)
• Starting in MongoDB 4.0, the mongos binary will crash when attempting to connect to mongod instances whose feature compatibility version (fCV) is greater than that of the mongos
Choosing shard key
• Cardinality
• Frequency
• Monotonically changing
● Choose a key that is included in most of your queries
● Ideally, avoid a huge number of documents sharing the same shard key value
● Choose a key that co-locates data you wish to retrieve together
Chunks
• A chunk is a contiguous range of shard key values within a particular shard
• Chunk ranges are inclusive of the lower boundary and exclusive of the upper boundary
• Chunks split when they grow beyond the configured chunk size (default is 64MB), which is configurable between 1MB and 1GB
• MongoDB migrates chunks when a shard contains too many chunks of a collection relative to other shards

Number of chunks    Migration threshold
< 20                2
20 - 79             4
>= 80               8
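The threshold table can be read as a small decision function. The sketch below is only an illustration: it assumes that "number of chunks" means the collection's total chunk count and that balancing starts once the gap between the shard with the most chunks and the shard with the fewest reaches the threshold. The function names and the shard counts are hypothetical.

```javascript
// Migration thresholds from the table: 2 below 20 chunks, 4 for 20-79, 8 from 80 up.
function migrationThreshold(totalChunks) {
  if (totalChunks < 20) return 2;
  if (totalChunks < 80) return 4;
  return 8;
}

// chunkCounts is a hypothetical map of shard name -> chunk count for one collection.
// Assumption: balancing starts when the gap reaches the threshold.
function needsBalancing(chunkCounts) {
  const counts = Object.values(chunkCounts);
  const total = counts.reduce((sum, n) => sum + n, 0);
  const gap = Math.max(...counts) - Math.min(...counts);
  return gap >= migrationThreshold(total);
}

console.log(needsBalancing({ s1: 10, s2: 9, s3: 9 }));   // 28 chunks, gap of 1
console.log(needsBalancing({ s1: 40, s2: 30, s3: 30 })); // 100 chunks, gap of 10
```

With 28 chunks the threshold is 4, so a gap of 1 triggers nothing; with 100 chunks the threshold is 8, so a gap of 10 does.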
Range based sharding
• Divides data into contiguous ranges determined by the shard key values
• Documents with “close” shard key values are likely to be in the same chunk or shard
• Query isolation: range queries are more likely to target a single shard
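A minimal sketch of how range based routing might look, assuming a numeric shard key and a hypothetical three-chunk table. `-Infinity` and `Infinity` stand in for MongoDB's MinKey and MaxKey, and the function names are made up for this example. Each range includes its lower bound and excludes its upper bound.

```javascript
// Hypothetical chunk table for a numeric shard key; every range is [min, max).
const rangeChunks = [
  { min: -Infinity, max: 100, shard: "shard1" },
  { min: 100, max: 200, shard: "shard2" },
  { min: 200, max: Infinity, shard: "shard3" },
];

// A key belongs to the chunk whose lower bound it reaches and whose upper bound it stays below.
function findChunk(key) {
  return rangeChunks.find((c) => key >= c.min && key < c.max);
}

// A range query [from, to) only needs the shards whose chunks overlap that range.
function shardsForRange(from, to) {
  const overlapping = rangeChunks.filter((c) => c.min < to && from < c.max);
  return [...new Set(overlapping.map((c) => c.shard))];
}

console.log(findChunk(100).shard);     // the boundary value belongs to the chunk starting at 100
console.log(shardsForRange(120, 180)); // narrow range: one shard
console.log(shardsForRange(90, 210));  // wide range: every shard
```

The narrow range stays on one shard, which is the query isolation the slide refers to; the wide range has to be sent to all three.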
Hashed based sharding
• Uses a hashed index of a single or compound key to partition data
• More even data distribution at the cost of reducing Query Isolation
• Applications do not need to compute hashes
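The trade-off can be sketched with a toy placement function. FNV-1a is used below purely as a stand-in hash; it is not the hash MongoDB uses for hashed indexes, and the four shards and the 2500-value ranges are assumptions made for the example. In a real cluster the hashing happens server-side, so the application never runs code like this.

```javascript
// FNV-1a (32-bit), used ONLY as a stand-in; MongoDB's hashed index uses a different hash.
function fnv1a(text) {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h = Math.imul(h ^ text.charCodeAt(i), 0x01000193) >>> 0;
  }
  return h;
}

const SHARD_COUNT = 4;

// Range placement: hypothetical ranges of 2500 key values per shard.
const rangeShard = (key) => Math.min(Math.floor(key / 2500), SHARD_COUNT - 1);
// Hashed placement: the hash of the key decides the shard.
const hashedShard = (key) => fnv1a(String(key)) % SHARD_COUNT;

const increasingKeys = [1000, 1001, 1002, 1003]; // monotonically increasing values
const rangePlacement = increasingKeys.map(rangeShard);
const hashedPlacement = increasingKeys.map(hashedShard);

console.log(rangePlacement);  // neighbouring keys share one shard
console.log(hashedPlacement); // neighbouring keys are scattered
```

Neighbouring keys stay together under range placement, which helps range queries, and are scattered under hashed placement, which spreads inserts but turns a range query into a broadcast.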
Balancer
• Background process that monitors the number of chunks on each shard
• Migrates chunks between shards to reach an equal number of chunks per shard
• Runs on the Primary of the config servers replica set
Diagram: 64MB chunks on Shard1, Shard2 and Shard3; the balancer migrates chunks until each shard holds four of them.
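As a toy model of that goal state, the sketch below moves one chunk at a time from the shard with the most chunks to the shard with the fewest, until the counts differ by at most one. The starting distribution of 7, 3 and 2 chunks is hypothetical, and the real balancer involves far more (locks, thresholds, migration failures) than this loop shows.

```javascript
// Toy model of count-based balancing; it looks only at chunk counts, never at data size.
function balanceByCount(chunkCounts) {
  const counts = { ...chunkCounts };
  let migrations = 0;
  for (;;) {
    const shards = Object.keys(counts);
    const most = shards.reduce((a, b) => (counts[b] > counts[a] ? b : a));
    const least = shards.reduce((a, b) => (counts[b] < counts[a] ? b : a));
    if (counts[most] - counts[least] <= 1) break; // balanced enough, stop
    counts[most] -= 1;
    counts[least] += 1;
    migrations += 1;
  }
  return { counts, migrations };
}

const balanced = balanceByCount({ Shard1: 7, Shard2: 3, Shard3: 2 });
console.log(balanced);
```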
Sharding a collection
• A tracking collection with documents of the form {client_id, asset_id, pushed_at, …}
• Each client has their assets tracked by GPS location
• Tens of thousands of clients
• Between tens and hundreds of thousands of assets per client
• Millions of document inserts per minute at the cluster level
• Data expires after 12 months (TTL index on pushed_at)
• Clients request the trajectories of one or more asset_id values within a time range
Sharding a collection
• Shard key options
{client_id : 1}
{client_id : 1, asset_id : 1}
{client_id : 1, asset_id : 1, pushed_at : 1}
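One way to compare the three options is to measure cardinality and frequency on a sample of documents. The sketch below does that on six made-up documents; the field names come from the slide, while the values and the `keyStats` helper are hypothetical.

```javascript
// Hypothetical sample of tracking documents (field names from the slide, values invented).
const sampleDocs = [
  { client_id: 1, asset_id: 10, pushed_at: "2022-05-10T12:00:00Z" },
  { client_id: 1, asset_id: 10, pushed_at: "2022-05-10T12:01:00Z" },
  { client_id: 1, asset_id: 11, pushed_at: "2022-05-10T12:00:00Z" },
  { client_id: 2, asset_id: 20, pushed_at: "2022-05-10T12:00:00Z" },
  { client_id: 2, asset_id: 20, pushed_at: "2022-05-10T12:02:00Z" },
  { client_id: 2, asset_id: 20, pushed_at: "2022-05-10T12:03:00Z" },
];

// For a candidate key: number of distinct values (cardinality) and the size of
// the largest group of documents sharing one value (frequency).
function keyStats(docs, fields) {
  const groups = new Map();
  for (const doc of docs) {
    const value = JSON.stringify(fields.map((f) => doc[f]));
    groups.set(value, (groups.get(value) || 0) + 1);
  }
  return { cardinality: groups.size, maxFrequency: Math.max(...groups.values()) };
}

const candidates = [
  ["client_id"],
  ["client_id", "asset_id"],
  ["client_id", "asset_id", "pushed_at"],
];
for (const fields of candidates) {
  console.log(fields.join(", "), keyStats(sampleDocs, fields));
}
```

On this sample, each field added to the key raises the cardinality and, for the full key, lowers the largest group to a single document, which is what lets chunks keep splitting as a client grows.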
Refining a shard key
• MongoDB version >= 4.4
• Starting shard key: {client_id : 1}
// Add asset_id as a suffix field to the existing shard key
db.adminCommand( { refineCollectionShardKey: "assets.tracking",
                   key: { client_id: 1, asset_id: 1 } } )
// Refine again by adding pushed_at as a further suffix field
db.adminCommand( { refineCollectionShardKey: "assets.tracking",
                   key: { client_id: 1, asset_id: 1, pushed_at: 1 } } )
Resharding a collection
• MongoDB version >= 5.0
// Replace the shard key entirely; pushed_at is hashed
db.adminCommand({
  reshardCollection: "assets.tracking",
  key: { asset_id: 1, pushed_at: "hashed" }
})
Things we’d like improved
Balancing
● Goal state: an equal number of chunks per shard
● The balancer does not balance data; it has no information about how much data each chunk actually holds
Diagram: Shard1, Shard2 and Shard3 each hold four chunks, all shown as 64MB.
Balancing
● Data expires and is deleted over time (TTL)
● This leaves empty chunks behind
● There is no internal thread that cleans up empty chunks
Diagram: the chunks on Shard1, Shard2 and Shard3 with their actual sizes: 50MB, 64MB, 32MB, 40MB, 48MB, 5MB, 5MB and 10MB.
Balancing
● Real data distribution: uneven

Shard     Data
Shard1    50MB
Shard2    184MB
Shard3    20MB
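The gap between chunk count and data size can be shown with a few lines of arithmetic. The per-shard totals (50MB, 184MB, 20MB) come from the slide. Which chunk sits on which shard, and the empty (0MB) chunks that bring every shard to four chunks, are assumptions made for this sketch.

```javascript
// Chunk sizes in MB per shard. Totals match the slide; the assignment of
// individual chunks and the empty 0MB chunks are assumptions for illustration.
const chunkSizesMB = {
  Shard1: [50, 0, 0, 0],
  Shard2: [64, 40, 48, 32],
  Shard3: [5, 5, 10, 0],
};

// Summarize what a count-based balancer sees (chunks) next to what it misses (dataMB).
function summarizeShards(sizes) {
  const summary = {};
  for (const [shard, chunks] of Object.entries(sizes)) {
    summary[shard] = {
      chunks: chunks.length,
      dataMB: chunks.reduce((sum, mb) => sum + mb, 0),
      emptyChunks: chunks.filter((mb) => mb === 0).length,
    };
  }
  return summary;
}

const shardSummary = summarizeShards(chunkSizesMB);
console.log(shardSummary);
```

Every shard reports four chunks, so a count-based balancer would consider this layout done, even though Shard2 carries most of the data.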
Cluster metadata
● In the config database, the chunks collection holds no information about chunk size
mongos> use config
mongos> db.chunks.findOne()
{ "_id" : "assets.tracking-client_id_1asset_id_3705937pushed_at_MinKey",
"lastmod" : Timestamp(562503, 1),
"lastmodEpoch" : ObjectId("5e2e7c6bcab375a6b995002d"),
"ns" : "assets.tracking",
"min" : {"client_id" : NumberLong(1), "asset_id" : NumberLong(3705937),"pushed_at" : { "$minKey" : 1 }},
"max" : {"client_id" : NumberLong(1),"asset_id" : NumberLong(3972512),"pushed_at" : { "$minKey" : 1 }},
"shard" : "s6",
"jumbo" : false,
"history" : [{"validAfter" : Timestamp(1642642009, 1312), "shard" : "s6"}]
}
Cluster metadata
● You have to run a separate script to get the size of every chunk
○ The dataSize command returns the size in bytes of the specified data
● This requires scanning every chunk and the documents that belong to it
● Merge the empty chunks with the mergeChunks operation
● Rebalance the chunks after the initial merge
○ Move chunks manually with the moveChunk command, or
○ Let the balancer redistribute the chunks
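Such a script mostly turns each chunk document into a dataSize command. A minimal sketch, assuming chunk documents shaped like the config.chunks print-out in this deck (with an "ns" field): `dataSizeCommand` is a hypothetical helper, the sample values are copied from that print-out, and `{ $minKey: 1 }` is only a placeholder for MinKey written the way the shell prints it.

```javascript
// Build a dataSize command for one chunk document.
// keyPattern must be the collection's shard key; estimate: true asks the server
// to estimate from the average document size instead of reading every document.
function dataSizeCommand(chunk, keyPattern) {
  return {
    dataSize: chunk.ns,
    keyPattern: keyPattern,
    min: chunk.min,
    max: chunk.max,
    estimate: true,
  };
}

// Sample chunk with the bounds shown in the print-out (plain numbers instead of NumberLong).
const sampleChunk = {
  ns: "assets.tracking",
  min: { client_id: 1, asset_id: 3705937, pushed_at: { $minKey: 1 } }, // placeholder for MinKey
  max: { client_id: 1, asset_id: 3972512, pushed_at: { $minKey: 1 } }, // placeholder for MinKey
  shard: "s6",
};
const shardKeyPattern = { client_id: 1, asset_id: 1, pushed_at: 1 };

const sizeCommand = dataSizeCommand(sampleChunk, shardKeyPattern);
console.log(JSON.stringify(sizeCommand, null, 2));
```

In the shell, each command document would be passed to db.runCommand(...) while iterating over config.chunks, collecting the reported size per chunk. Dropping estimate gives exact sizes at the cost of a full scan of each range.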
Merging empty chunks
● Only chunks that form a contiguous range on the same shard can be merged
● Rebalancing only moves chunks that are not in active use (not ‘hot’)
Diagram: Shard1, Shard2 and Shard3 with chunk sizes 50MB, 64MB, 32MB, 40MB, 48MB, 5MB, 5MB and 10MB; after the merge, chunks are migrated between shards.
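Finding merge candidates comes down to grouping neighbouring empty chunks that sit on the same shard. The sketch below assumes the chunks are already sorted by range and that sizes were collected beforehand; the bounds, sizes, shard names and the `mergeCommands` helper are all hypothetical. Each result has the shape of a mergeChunks admin command.

```javascript
// Bounds are documents, so compare them by their JSON form.
const sameBound = (a, b) => JSON.stringify(a) === JSON.stringify(b);

// Collect runs of two or more empty chunks that are contiguous and on the same shard,
// and emit one mergeChunks command per run.
function mergeCommands(ns, sortedChunks) {
  const commands = [];
  let run = [];
  const flush = () => {
    if (run.length > 1) {
      commands.push({ mergeChunks: ns, bounds: [run[0].min, run[run.length - 1].max] });
    }
    run = [];
  };
  for (const chunk of sortedChunks) {
    const prev = run[run.length - 1];
    const extendsRun =
      prev !== undefined &&
      chunk.sizeMB === 0 &&
      prev.shard === chunk.shard &&
      sameBound(prev.max, chunk.min);
    if (!extendsRun) flush();
    if (chunk.sizeMB === 0) run.push(chunk);
  }
  flush();
  return commands;
}

// Hypothetical chunks, sorted by range, with sizes from an earlier scan.
const sizedChunks = [
  { min: { client_id: 0 }, max: { client_id: 100 }, shard: "s1", sizeMB: 50 },
  { min: { client_id: 100 }, max: { client_id: 200 }, shard: "s1", sizeMB: 0 },
  { min: { client_id: 200 }, max: { client_id: 300 }, shard: "s1", sizeMB: 0 },
  { min: { client_id: 300 }, max: { client_id: 400 }, shard: "s2", sizeMB: 0 },
  { min: { client_id: 400 }, max: { client_id: 500 }, shard: "s2", sizeMB: 64 },
];

const merges = mergeCommands("assets.tracking", sizedChunks);
console.log(JSON.stringify(merges));
```

Only the two empty chunks on s1 are merged. The empty chunk on s2 is contiguous with them but lives on another shard, so it would first have to be moved.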
Migrating chunks
1. The balancer process sends the moveChunk command to the source shard
2. The source starts the move with an internal moveChunk command. During the migration process,
operations to the chunk route to the source shard. The source shard is responsible for incoming write
operations for the chunk
3. The destination shard builds any indexes required by the source that do not exist on the destination
4. The destination shard begins requesting documents in the chunk and starts receiving copies of the data.
5. After receiving the final document in the chunk, the destination shard starts a synchronization process to
ensure that it has the changes to the migrated documents that occurred during the migration
6. When fully synchronized, the source shard connects to the config database and updates the
cluster metadata with the new location for the chunk
7. After the source shard completes the update of the metadata, and once there are no open cursors on the
chunk, the source shard deletes its copy of the documents
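When a chunk is moved by hand rather than by the balancer, the procedure starts from the same moveChunk command, issued through mongos. A minimal sketch of building that command document: the `moveChunkCommand` helper, the chunk and the shard names are hypothetical.

```javascript
// Build a moveChunk command for one chunk; targetShard is the destination shard name.
function moveChunkCommand(chunk, targetShard) {
  if (chunk.shard === targetShard) {
    throw new Error("chunk already lives on " + targetShard);
  }
  return { moveChunk: chunk.ns, bounds: [chunk.min, chunk.max], to: targetShard };
}

// Hypothetical chunk; a single-field key keeps the example short.
const coldChunk = {
  ns: "assets.tracking",
  min: { client_id: 100 },
  max: { client_id: 200 },
  shard: "s2",
};

const moveCommand = moveChunkCommand(coldChunk, "s1");
console.log(JSON.stringify(moveCommand));
```

In the shell, the resulting document would be passed to db.adminCommand(...) on a mongos.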
Migrating chunks
● Updating the cluster metadata takes a collection-level lock
● Rebalancing only moves chunks that are not ‘hot’
● Perform migrations during off-peak hours
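One way to keep migrations in off-peak hours is a balancer window, stored in the settings collection of the config database. The sketch below only builds the documents for that update and adds a small helper to reason about the window. The 23:00 to 06:00 window and both function names are assumptions for the example.

```javascript
// Documents for setting the balancer window (to be applied to config.settings).
function balancerWindowUpdate(start, stop) {
  return {
    filter: { _id: "balancer" },
    update: { $set: { activeWindow: { start: start, stop: stop } } },
    options: { upsert: true },
  };
}

// True when a zero-padded "HH:MM" time falls inside the window,
// including windows that wrap past midnight (for example 23:00 to 06:00).
function inWindow(time, start, stop) {
  if (start <= stop) return time >= start && time < stop;
  return time >= start || time < stop;
}

const windowUpdate = balancerWindowUpdate("23:00", "06:00");
console.log(JSON.stringify(windowUpdate));
console.log(inWindow("02:30", "23:00", "06:00")); // night: inside the window
console.log(inWindow("12:00", "23:00", "06:00")); // midday: outside the window
```

In the shell, the three parts would be passed to db.settings.updateOne(filter, update, options) after switching to the config database.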
Observability
mongostat -u<username> -p<password> --authenticationDatabase=admin --discover --all --interactive 5
host insert query update delete getmore command dirty used flushes mapped vsize res nonmapped faults lrw lrwt qrw arw net_in net_out conn set repl time
A1r2:27018 4 290 197 *0 1531 1082|0 2.9% 80.0% 0 69.0G 49.2G 69.0G n/a 0.0%|0.0% 0|0 0|0 1|0 2.96m 11.9m 711 s1 PRI May 10 11:48:55.927
A2r1:27018 8 205 146 *0 1244 885|0 0.7% 80.0% 0 65.8G 48.2G 65.8G n/a 0.0%|0.0% 0|0 0|0 1|0 2.42m 9.64m 529 s2 PRI May 10 11:48:55.934
A3r2:27018 10 158 106 *0 986 694|0 0.8% 80.0% 0 68.1G 48.7G 68.1G n/a 0.0%|0.0% 0|0 0|0 1|0 1.77m 6.18m 476 s3 PRI May 10 11:48:55.913
A4r1:27018 4 203 143 *0 1195 793|0 2.0% 80.0% 0 70.0G 48.7G 70.0G n/a 0.0%|0.0% 0|0 0|0 1|0 2.13m 7.88m 499 s4 PRI May 10 11:48:55.945
A5r1:27018 3 174 110 *0 964 693|0 1.5% 80.0% 0 68.1G 49.1G 68.1G n/a 0.0%|0.0% 0|0 0|0 1|0 1.78m 6.40m 487 s5 PRI May 10 11:48:55.946
A6r3:27018 3 216 134 *0 1114 815|0 1.1% 80.0% 0 68.6G 48.8G 68.6G n/a 0.0%|0.0% 0|0 0|0 1|0 2.16m 10.7m 461 s6 PRI May 10 11:48:55.941
A7r3:27018 4 192 130 *0 680 456|0 1.4% 80.0% 0 66.7G 49.0G 66.7G n/a 0.0%|0.0% 0|0 0|0 1|0 1.69m 7.22m 472 s7 PRI May 10 11:48:55.955
A8r1:27018 2 265 155 *0 1817 1116|0 2.8% 80.0% 0 67.2G 48.8G 67.2G n/a 0.0%|0.0% 0|0 0|2 1|0 2.79m 10.8m 498 s8 PRI May 10 11:48:55.930
A9r3:27018 2 146 88 *0 712 555|0 1.2% 80.0% 0 65.3G 48.9G 65.3G n/a 0.0%|0.0% 0|0 0|0 1|0 1.63m 7.95m 454 s9 PRI May 10 11:48:55.942
mongostat -u<username> -p<password> --authenticationDatabase=admin --discover -o="host,connections.current=connectionsCurrent,metrics.document.returned.rate()=documentsReturned,metrics.operation.scanAndOrder.rate()=scanAndOrder,metrics.queryExecutor.scanned.rate()=indexScans,metrics.queryExecutor.scannedObjects.rate()=documentScans,localTime" --interactive 5
host connectionsCurrent documentsReturned scanAndOrder indexScans documentScans localTime
A1r2:27018 711 2378 1 3478 3442 2022-05-10 12:03:06.847 +0000 UTC
A2r1:27018 533 1457 2 719 1639 2022-05-10 12:03:06.88 +0000 UTC
A3r2:27018 472 945 1 651 1132 2022-05-10 12:03:06.846 +0000 UTC
A4r1:27018 498 1344 2 1835 2557 2022-05-10 12:03:06.861 +0000 UTC
A5r1:27018 482 1097 3 3510 3674 2022-05-10 12:03:06.897 +0000 UTC
A6r3:27018 463 1146 1 1123 1457 2022-05-10 12:03:06.903 +0000 UTC
A7r3:27018 478 927 1 1034 1313 2022-05-10 12:03:06.908 +0000 UTC
A8r1:27018 498 1539 1 979 1682 2022-05-10 12:03:06.859 +0000 UTC
A9r3:27018 455 995 1 7988 1985 2022-05-10 12:03:06.897 +0000 UTC
Summary
● The balancer only cares whether the cluster has an equal number of chunks per shard
● Migrating a chunk from the source to the target shard requires a collection-level lock
● Each migrated chunk is written three times (write on the source, write on the target, delete on the source)
● Choose a shard key that avoids broadcast operations on the cluster
● Hashed sharding usually requires more resources than range based sharding
● Choose a shard key with high cardinality that is included in most of your queries
● After scaling out, it might take days or weeks before the data is redistributed
● Only shard your large collections
● Avoid sharding if you can manage your data in replica sets
Thank you!
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
 

Sharding and things we'd like to see improved

  • 1. Sharding and things we’d like to see improved. Igor Donchovski, May 17th, 2022. © Pythian Services Inc 2022 | Confidential
  • 2. About me: Igor Donchovski, Principal Consultant, Pythian - OSDB
  • 3. Pythian overview
      • 25 years in business
      • 450+ experts across every data domain and technology
      • 400+ global customers
      • Pythian maximizes the value of your data estate by delivering advanced on-prem, hybrid, cloud, and multi-cloud solutions and solving your toughest data and analytics challenges.
      • Partner tiers (the vendor logos were not captured in the transcript):
          – Gold Partner: 40+ certifications; 5 competencies, incl. Data Analytics and Data Platform
          – Silver Partner: 50+ certifications; Migration Factory; Certified Apps Hosting
          – Advanced Partner: 100+ certs/accreds; Migration & DevOps competencies
          – Premier Partner: 120+ certs/creds; 6 specializations, MSP badge, Partner & Technical Advisory Boards
          – Select Partner: 10+ certs/accreds
          – Platinum Partner: 60+ certifications; Advisory Board member
  • 4. Overview
      • Scaling (vertical and horizontal)
      • What is a sharded cluster in MongoDB
      • Cluster components - shards, config servers, mongos
      • Shard keys and chunks
      • Hashed and range-based sharding
      • Choosing a shard key
      • Things we’d like to see improved
      • Q&A
  • 5. Scaling
  • 6. Scaling: Time for scaling? (diagram: App)
  • 7. Scaling: Time for scaling?
      • The CPU and/or memory becomes overloaded, and the database server either cannot handle the full request throughput or cannot respond within a reasonable amount of time.
      • Your database server runs out of storage, and thus cannot store all the data.
      • Your network interface is overloaded, so it cannot support all the network traffic received.
  • 8. Scaling
      • Vertical scaling
          – Adding more power (CPU, RAM, disk) to an existing machine
          – Might require downtime while scaling up
      • Horizontal scaling
          – Adding more machines into your pool of resources
          – Partitioning the data on multiple machines
          – Parallelizing the processing
          – More complex to implement and maintain
  • 9. Vertical Scaling (diagram: a server with 1TB, 8 vCPU, 16G Mem scaled up to 2TB, 16 vCPU, 32G Mem)
  • 10. Horizontal Scaling - Reads (diagram: a 1TB replica set with one Primary and eleven Secondaries spread across DC1, DC2, DC3, DC4; five members are marked votes:0)
  • 11. Horizontal Scaling - Writes (diagram: a 1TB data set split into four 256GB shards, Shard 1 to Shard 4)
  • 12. Scaling with MongoDB
  • 13. Scaling per tenant (diagram: Standalone App, where one App uses a single Tenant DB holding Tenant A, Tenant B, Tenant C)
  • 14. Scaling per tenant (diagram comparing three layouts: Standalone App, Database per Tenant, and Sharded Multi-Tenant with a Catalog)
  • 15. Scaling per tenant (diagram: Sharded Multi-Tenant, with App, Catalog, and Tenant A, Tenant B, Tenant C through N)
  • 16. Scaling with MongoDB - Sharding
      • Shard/Replica set (subset of the sharded data)
      • Config servers (metadata and config settings)
      • mongos (query router, cluster interface)
      mongos> sh.addShard("shardN")
  • 17. Scaling with MongoDB - Shards
      • Each shard contains a subset of the sharded data
      • Deployed as a replica set for redundancy and HA
      • Primary shard: assigned per database (mongos picks the shard with the least amount of data)
      • The primary shard holds the non-sharded collections
      • Started with --shardsvr, or sharding.clusterRole: shardsvr in the config file (default port 27018)
  • 18. Scaling with MongoDB - Config Servers
      • Store the metadata for the sharded cluster in the config database
      • Store authentication configuration information in the admin database
      • Config servers must be deployed as a replica set (>= 3.4)
      • The balancer runs on the Primary node (>= 3.4)
      • Started with --configsvr, or sharding.clusterRole: configsvr in the config file (default port 27019)
  • 19. Scaling with MongoDB - mongos
      • Caches metadata from the config servers
      • Routes queries to shards
      • No persistent state
      • Updates its cache on metadata changes
      • Holds the balancer (MongoDB <= 3.2)
      • Starting in MongoDB 4.0, the mongos binary will crash when attempting to connect to mongod instances whose feature compatibility version (fCV) is greater than that of the mongos
  • 20. Choosing shard key
      • Factors to weigh: Cardinality, Frequency, Monotonically changing values
      ● Choose a key that is included in most of your queries
      ● Ideally, you don’t want a huge number of documents to share the same shard key value
      ● Choose something that will co-locate data you wish to retrieve together
  • 21. Chunks
      • A contiguous range of shard key values within a particular shard
      • Chunk ranges are inclusive of the lower boundary and exclusive of the upper boundary
      • Chunks split when they grow beyond the configured chunk size (default is 64MB), which is configurable between 1MB and 1GB
      • MongoDB migrates chunks when a shard contains too many chunks of a collection relative to other shards
      Number of chunks | Migration threshold
      < 20             | 2
      20 - 79          | 4
      >= 80            | 8
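
The chunk rules on slide 21 can be sketched in plain JavaScript. This is only an illustration, not MongoDB's balancer code: it assumes a single numeric shard key field, and it assumes the "Number of chunks" column counts all chunks of the collection. The function names are hypothetical.

```javascript
// Migration threshold from the slide's table. Assumption: `totalChunks`
// is the total number of chunks of the collection across all shards.
function migrationThreshold(totalChunks) {
  if (totalChunks < 20) return 2;
  if (totalChunks < 80) return 4;
  return 8;
}

// A chunk range includes its lower boundary and excludes its upper boundary.
// Simplification: one numeric shard key field instead of BSON key documents.
function chunkContains(chunk, keyValue) {
  return keyValue >= chunk.min && keyValue < chunk.max;
}

// The balancer acts once the difference between the shard with the most
// chunks and the shard with the fewest chunks reaches the threshold.
function needsBalancing(chunksPerShard) {
  const counts = Object.values(chunksPerShard);
  const total = counts.reduce((a, b) => a + b, 0);
  const diff = Math.max(...counts) - Math.min(...counts);
  return diff >= migrationThreshold(total);
}

console.log(chunkContains({ min: 10, max: 20 }, 10)); // lower bound is inside
console.log(chunkContains({ min: 10, max: 20 }, 20)); // upper bound is outside
console.log(needsBalancing({ shard1: 5, shard2: 4, shard3: 4 }));
console.log(needsBalancing({ shard1: 6, shard2: 4, shard3: 3 }));
```

Note that nothing in this check looks at how much data a chunk holds, which is the limitation the later "Balancing" slides come back to.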
  • 22. Range-based sharding
      • Divides data into contiguous ranges determined by the shard key values
      • Documents with “close” shard key values are likely to be in the same chunk or shard
      • Query isolation - range queries are more likely to target a single shard
  • 23. Hashed sharding
      • Uses a hashed index of a single or compound key to partition data
      • More even data distribution at the cost of reduced query isolation
      • Applications do not need to compute hashes
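
The trade-off between slides 22 and 23 shows up most clearly with a monotonically increasing key. The sketch below is a simulation under stated assumptions: four shards with one chunk each, hypothetical range boundaries at 1000, 2000 and 3000, and SHA-256 from Node's `crypto` module as a stand-in for MongoDB's internal hash function (which is not what MongoDB uses).

```javascript
const crypto = require("crypto");

const SHARDS = 4;

// Range-based: hypothetical chunk boundaries, one chunk per shard.
// Each chunk covers [previous upper bound, upper bound).
function rangeShard(key) {
  const upperBounds = [1000, 2000, 3000, Infinity];
  return upperBounds.findIndex((upper) => key < upper);
}

// Hashed: hash the key, then split the 32-bit hash space into equal ranges.
// SHA-256 is only a stand-in hash for this illustration.
function hashedShard(key) {
  const h = crypto.createHash("sha256").update(String(key)).digest().readUInt32BE(0);
  return Math.floor(h / (2 ** 32 / SHARDS));
}

// Count how many keys land on each shard.
function distribute(keys, pickShard) {
  const counts = new Array(SHARDS).fill(0);
  for (const k of keys) counts[pickShard(k)] += 1;
  return counts;
}

// 1000 new inserts whose key keeps growing (3000, 3001, ...).
const newKeys = Array.from({ length: 1000 }, (_, i) => 3000 + i);
console.log("range :", distribute(newKeys, rangeShard));
console.log("hashed:", distribute(newKeys, hashedShard));
```

With range-based placement every new key falls into the last chunk, so one shard takes all the inserts; with hashing the inserts spread over all shards, but neighbouring keys no longer live together, so a range query has to visit several shards.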
  • 24. Balancer
      • Background process that monitors the number of chunks on each shard
      • Migrates chunks between shards to reach an equal number of chunks per shard
      • Runs on the Primary of the config servers replica set
      (diagram: thirteen 64MB chunks spread over Shard1, Shard2, Shard3, with one chunk marked “Migrate”)
  • 25. Balancer (same slide after the migration; diagram: twelve 64MB chunks spread over Shard1, Shard2, Shard3)
  • 26. Sharding a collection
      • A collection named tracking holds documents of the form {client_id, asset_id, pushed_at…….}
      • Each client has their assets tracked by GPS location
      • Tens of thousands of clients
      • Each client has between tens and hundreds of thousands of assets
      • Millions of document inserts per minute at the cluster level
      • Data expires after 12 months, via a TTL index on pushed_at
      • Clients request the trajectories of one or more asset_id values within a time range
  • 27. Sharding a collection
      • Shard key options
          {client_id : 1}
          {client_id : 1, asset_id : 1}
          {client_id : 1, asset_id : 1, pushed_at : 1}
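
One way to compare the three options on slide 27 is to ask which queries mongos can route to specific shards. A query can be targeted when it constrains the shard key or a leading prefix of a compound shard key; otherwise it is broadcast to every shard. The helper below is a hypothetical sketch of that rule only (it ignores query operators and chunk placement), and the query shapes are assumptions based on the workload described on slide 26.

```javascript
// Returns how a query would be routed for a given shard key.
// `shardKeyFields` lists the shard key fields in order; `queryFields`
// lists the fields the query filters on. Both are plain string arrays.
function routing(shardKeyFields, queryFields) {
  let prefixLength = 0;
  for (const field of shardKeyFields) {
    if (!queryFields.includes(field)) break; // the usable prefix ends here
    prefixLength += 1;
  }
  return {
    mode: prefixLength > 0 ? "targeted" : "broadcast",
    prefixLength,
  };
}

const options = [
  ["client_id"],
  ["client_id", "asset_id"],
  ["client_id", "asset_id", "pushed_at"],
];

// Assumed query shape: trajectories for a client's assets in a time range.
const trajectoryQuery = ["client_id", "asset_id", "pushed_at"];
// Assumed query shape: a lookup that omits client_id.
const assetOnlyQuery = ["asset_id", "pushed_at"];

for (const key of options) {
  console.log(key.join(","), routing(key, trajectoryQuery), routing(key, assetOnlyQuery));
}
```

All three options keep the trajectory query targeted because it includes client_id; the longer keys mainly add cardinality, so that a single large client can be split across more chunks. A query that leaves out client_id is broadcast under every option.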
  • 28. Refining a shard key
      • MongoDB version >= 4.4
      • Starting from the shard key {client_id : 1}:
          db.adminCommand( {
            refineCollectionShardKey: "assets.tracking",
            key: { client_id: 1, asset_id: 1 }
          } )
          db.adminCommand( {
            refineCollectionShardKey: "assets.tracking",
            key: { client_id: 1, asset_id: 1, pushed_at: 1 }
          } )
  • 29. Resharding a collection
      • MongoDB version >= 5.0
          db.adminCommand( {
            reshardCollection: "assets.tracking",
            key: { asset_id: 1, pushed_at: "hashed" }
          } )
  • 30. Things we’d like improved
  • 31. Balancing
      ● Goal state: reach an equal number of chunks per shard
      ● The balancer does not balance data and has no information about how much data each chunk actually holds
      (diagram: Shard1, Shard2, Shard3 holding twelve 64MB chunks in total)
  • 32. Balancing
      ● Data expires and gets deleted over time (TTL)
      ● Chunks are left empty as their data expires
      ● There is no internal thread to clean up empty chunks
      (diagram: actual chunk sizes across Shard1, Shard2, Shard3: 50MB, 64MB, 32MB, 40MB, 48MB, 5MB, 5MB, 10MB)
  • 33. Balancing
      ● Real data distribution: Shard1 50MB, Shard2 184MB, Shard3 20MB
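
The numbers on slides 31 to 33 can be reproduced with a short calculation. The per-shard grouping of the chunk sizes is inferred from the totals on slide 33, and the zero-size entries are an assumption added here so that each shard still has four chunks, as on slide 31; they stand for the empty chunks the TTL deletes leave behind.

```javascript
// Chunk sizes in MB per shard. Non-zero values come from slide 32;
// the 0 entries are assumed empty chunks (not shown on the slide).
const chunkSizesMB = {
  Shard1: [50, 0, 0, 0],
  Shard2: [64, 32, 40, 48],
  Shard3: [5, 5, 10, 0],
};

// Summarise what the balancer sees (chunk count) against what is
// actually stored (data size).
function summarize(sizesByShard) {
  const summary = {};
  for (const [shard, sizes] of Object.entries(sizesByShard)) {
    summary[shard] = {
      chunks: sizes.length,
      dataMB: sizes.reduce((a, b) => a + b, 0),
      emptyChunks: sizes.filter((s) => s === 0).length,
    };
  }
  return summary;
}

console.log(summarize(chunkSizesMB));
```

By chunk count the cluster is perfectly balanced, so the balancer has nothing to do, while Shard2 stores several times more data than the other two shards.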
  • 34. Balancing: Uneven distribution
  • 35. Cluster metadata
      ● In the config database, the chunks collection has no information on chunk size
      mongos> db.chunks.findOne()
      {
        "_id" : "assets.tracking-client_id_1asset_id_3705937pushed_at_MinKey",
        "lastmod" : Timestamp(562503, 1),
        "lastmodEpoch" : ObjectId("5e2e7c6bcab375a6b995002d"),
        "ns" : "assets.tracking",
        "min" : {"client_id" : NumberLong(1), "asset_id" : NumberLong(3705937), "pushed_at" : { "$minKey" : 1 }},
        "max" : {"client_id" : NumberLong(1), "asset_id" : NumberLong(3972512), "pushed_at" : { "$minKey" : 1 }},
        "shard" : "s6",
        "jumbo" : false,
        "history" : [{"validAfter" : Timestamp(1642642009, 1312), "shard" : "s6"}]
      }
  • 36. Cluster metadata
      ● You have to run a separate script to get all chunks with their sizes
          ○ The dataSize command returns the size in bytes for the specified data
      ● This requires scanning all of the chunks and the documents that belong to them
      ● Merge empty chunks using the mergeChunks operation
      ● Rebalance the chunks after the initial merge
          ○ Use the moveChunk command to move chunks manually, or
          ○ Let the balancer redistribute the chunks
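
The "separate script" from slide 36 could be organised along the following lines. This is a sketch, not the presenter's script: the helper names and the `sizeBytes` field are hypothetical, and the functions only build command documents and pick merge candidates, so they can be tried without a cluster. `dataSize` (with `keyPattern`, `min`, `max`) and `mergeChunks` (with `bounds`) are existing MongoDB commands; the chunk documents are assumed to carry `ns`, `min`, `max` and `shard` as in the `findOne()` output on slide 35.

```javascript
// Build the dataSize command for one chunk document from config.chunks.
function dataSizeCommand(chunk, keyPattern) {
  return { dataSize: chunk.ns, keyPattern, min: chunk.min, max: chunk.max };
}

// Find runs of two or more empty chunks that are adjacent and on the same
// shard. `chunks` must be sorted by `min`; `sizeBytes` is a hypothetical
// field holding the size previously returned by dataSize for that chunk.
function mergeCandidates(chunks) {
  const runs = [];
  let run = [];
  const flush = () => {
    if (run.length > 1) {
      runs.push({
        shard: run[0].shard,
        bounds: [run[0].min, run[run.length - 1].max],
        count: run.length,
      });
    }
    run = [];
  };
  for (const c of chunks) {
    const last = run[run.length - 1];
    const contiguous =
      last !== undefined &&
      last.shard === c.shard &&
      JSON.stringify(last.max) === JSON.stringify(c.min);
    if (c.sizeBytes === 0 && (run.length === 0 || contiguous)) {
      run.push(c); // extend the current run of empty chunks
    } else {
      flush(); // the run ends at a non-empty chunk or a shard change
      if (c.sizeBytes === 0) run.push(c);
    }
  }
  flush();
  return runs;
}

// Build the mergeChunks command for one candidate run.
function mergeChunksCommand(ns, run) {
  return { mergeChunks: ns, bounds: run.bounds };
}
```

In the shell, each `dataSizeCommand(...)` result would be passed to `runCommand` on the collection's database (for example `db.getSiblingDB("assets").runCommand(cmd)`), and each `mergeChunksCommand(...)` result to `db.adminCommand(...)` on a mongos. Running `dataSize` for every chunk scans the underlying documents, so on a cluster of this size it is better done off-peak.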
  • 37. Merging empty chunks
      ● Merge chunks that form a contiguous range on the same shard
      (diagram: chunk sizes across Shard1, Shard2, Shard3: 50MB, 64MB, 32MB, 40MB, 48MB, 5MB, 5MB, 10MB)
  • 38. Merging empty chunks
      ● Rebalancing only moves chunks that are not in use, that is, not ‘hot’
      (diagram: the same chunk sizes across Shard1, Shard2, Shard3, with two chunks marked “Migrate”)
  • 39. Migrating chunks
      1. The balancer process sends the moveChunk command to the source shard.
      2. The source starts the move with an internal moveChunk command. During the migration process, operations to the chunk route to the source shard. The source shard is responsible for incoming write operations for the chunk.
      3. The destination shard builds any indexes required by the source that do not exist on the destination.
      4. The destination shard begins requesting documents in the chunk and starts receiving copies of the data.
      5. After receiving the final document in the chunk, the destination shard starts a synchronization process to ensure that it has the changes to the migrated documents that occurred during the migration.
      6. When fully synchronized, the source shard connects to the config database and updates the cluster metadata with the new location for the chunk.
      7. After the source shard completes the update of the metadata, and once there are no open cursors on the chunk, the source shard deletes its copy of the documents.
  • 40. Migrating chunks
      ● Updating the cluster metadata takes a collection-level lock
      ● Rebalancing only moves chunks that are not ‘hot’
      ● Perform migrations during off-peak hours
  • 41. Observability
  • 42. Observability
  • 43. Observability
      mongostat -u<username> -p<password> --authenticationDatabase=admin --discover --all --interactive 5
      host insert query update delete getmore command dirty used flushes mapped vsize res nonmapped faults lrw lrwt qrw arw net_in net_out conn set repl time
      A1r2:27018 4 290 197 *0 1531 1082|0 2.9% 80.0% 0 69.0G 49.2G 69.0G n/a 0.0%|0.0% 0|0 0|0 1|0 2.96m 11.9m 711 s1 PRI May 10 11:48:55.927
      A2r1:27018 8 205 146 *0 1244 885|0 0.7% 80.0% 0 65.8G 48.2G 65.8G n/a 0.0%|0.0% 0|0 0|0 1|0 2.42m 9.64m 529 s2 PRI May 10 11:48:55.934
      A3r2:27018 10 158 106 *0 986 694|0 0.8% 80.0% 0 68.1G 48.7G 68.1G n/a 0.0%|0.0% 0|0 0|0 1|0 1.77m 6.18m 476 s3 PRI May 10 11:48:55.913
      A4r1:27018 4 203 143 *0 1195 793|0 2.0% 80.0% 0 70.0G 48.7G 70.0G n/a 0.0%|0.0% 0|0 0|0 1|0 2.13m 7.88m 499 s4 PRI May 10 11:48:55.945
      A5r1:27018 3 174 110 *0 964 693|0 1.5% 80.0% 0 68.1G 49.1G 68.1G n/a 0.0%|0.0% 0|0 0|0 1|0 1.78m 6.40m 487 s5 PRI May 10 11:48:55.946
      A6r3:27018 3 216 134 *0 1114 815|0 1.1% 80.0% 0 68.6G 48.8G 68.6G n/a 0.0%|0.0% 0|0 0|0 1|0 2.16m 10.7m 461 s6 PRI May 10 11:48:55.941
      A7r3:27018 4 192 130 *0 680 456|0 1.4% 80.0% 0 66.7G 49.0G 66.7G n/a 0.0%|0.0% 0|0 0|0 1|0 1.69m 7.22m 472 s7 PRI May 10 11:48:55.955
      A8r1:27018 2 265 155 *0 1817 1116|0 2.8% 80.0% 0 67.2G 48.8G 67.2G n/a 0.0%|0.0% 0|0 0|2 1|0 2.79m 10.8m 498 s8 PRI May 10 11:48:55.930
      A9r3:27018 2 146 88 *0 712 555|0 1.2% 80.0% 0 65.3G 48.9G 65.3G n/a 0.0%|0.0% 0|0 0|0 1|0 1.63m 7.95m 454 s9 PRI May 10 11:48:55.942

      mongostat -u<username> -p<password> --authenticationDatabase=admin --discover -o="host,connections.current=connectionsCurrent,metrics.document.returned.rate()=documentsReturned,metrics.operation.scanAndOrder.rate()=scanAndOrder,metrics.queryExecutor.scanned.rate()=indexScans,metrics.queryExecutor.scannedObjects.rate()=documentScans,localTime" --interactive 5
      host        connectionsCurrent  documentsReturned  scanAndOrder  indexScans  documentScans  localTime
      A1r2:27018  711                 2378               1             3478        3442           2022-05-10 12:03:06.847 +0000 UTC
      A2r1:27018  533                 1457               2             719         1639           2022-05-10 12:03:06.88 +0000 UTC
      A3r2:27018  472                 945                1             651         1132           2022-05-10 12:03:06.846 +0000 UTC
      A4r1:27018  498                 1344               2             1835        2557           2022-05-10 12:03:06.861 +0000 UTC
      A5r1:27018  482                 1097               3             3510        3674           2022-05-10 12:03:06.897 +0000 UTC
      A6r3:27018  463                 1146               1             1123        1457           2022-05-10 12:03:06.903 +0000 UTC
      A7r3:27018  478                 927                1             1034        1313           2022-05-10 12:03:06.908 +0000 UTC
      A8r1:27018  498                 1539               1             979         1682           2022-05-10 12:03:06.859 +0000 UTC
      A9r3:27018  455                 995                1             7988        1985           2022-05-10 12:03:06.897 +0000 UTC
  • 44. Summary
      ● The balancer only cares whether the cluster has an equal number of chunks per shard
      ● Migrating a chunk from the source to the target shard requires a collection-level lock
      ● Each moved chunk is written three times (the original write on the source shard, the write on the destination shard, and the delete on the source shard)
      ● Choose a shard key that will avoid broadcast operations on the cluster
      ● Hashed sharding usually requires more resources than range-based sharding
      ● Choose a shard key that is included in most of your queries and has high cardinality
      ● Scaling out might take days or weeks before the data gets redistributed
      ● Only use sharding for your large collections
      ● Avoid sharding if you can manage your data in replica sets
  • 45. Thank you!