2. Why Sharding
• Vertical scaling is limited by physics and cost
• Hard to scale vertically in the cloud
• Can scale out (wider) further than up (higher)
3. What is Sharding
• ad-hoc partitioning
• consistent hashing (Dynamo)
• range based partitioning (BigTable/PNUTS)
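To make the last two approaches concrete, here is a minimal sketch that contrasts consistent hashing with range based partitioning. It is a toy illustration, not how Dynamo, BigTable, or PNUTS are implemented; the shard names, the `HashRing` and `RangeMap` classes, and the use of MD5 as the ring hash are all assumptions made for the example.

```python
import bisect
import hashlib

# Hypothetical shard names, used only for illustration.
SHARDS = ["shard0", "shard1", "shard2"]


def _hash(value: str) -> int:
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent hashing: each shard owns several points on a hash ring,
    and a key belongs to the first point at or after its own hash."""

    def __init__(self, shards, points_per_shard=64):
        self._ring = sorted(
            (_hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(points_per_shard)
        )
        self._points = [point for point, _ in self._ring]

    def shard_for(self, key: str) -> str:
        # Wrap around to the first point when the key hashes past the last one.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


class RangeMap:
    """Range based partitioning: sorted split points divide the key space
    into contiguous ranges, one per shard."""

    def __init__(self, split_points, shards):
        if len(shards) != len(split_points) + 1:
            raise ValueError("need exactly one more shard than split points")
        self._splits = list(split_points)
        self._shards = list(shards)

    def shard_for(self, key: str) -> str:
        # Keys below the first split go to the first shard, and so on.
        return self._shards[bisect.bisect_right(self._splits, key)]
```

With split points `["h", "p"]`, `RangeMap` keeps neighbouring keys on the same shard, which is what makes range queries on the key cheap. `HashRing` scatters neighbouring keys, but removing a shard only moves the keys that shard owned.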
4. Overview
• Automatic partitioning and management
• range based
• Can convert from single master to sharded system with 0 downtime
• Almost no functionality lost over single master
• Fully consistent
7. Config Servers
• 3 of them
• changes are made with 2 phase commit
• if any are down, metadata goes read only
• system stays online as long as at least 1 of the 3 is up
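The rules on this slide can be sketched with a toy in-memory model: a metadata change must be prepared on all three config servers before it is committed, so a single server being down makes metadata read only, while reads keep working as long as one server is up. This is only a sketch under those assumptions, not MongoDB's implementation; `ConfigServer` and `ConfigCluster` are hypothetical names, and the model ignores failures that happen between the two phases.

```python
class ConfigServer:
    """One in-memory config server (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.metadata = {}
        self._staged = None

    def prepare(self, change):
        # Phase 1: vote yes only if this server is reachable.
        if not self.up:
            return False
        self._staged = dict(change)
        return True

    def commit(self):
        # Phase 2: apply the staged change.
        self.metadata.update(self._staged)
        self._staged = None

    def abort(self):
        self._staged = None


class ConfigCluster:
    """Coordinates metadata changes across all config servers."""

    def __init__(self, servers):
        self.servers = list(servers)

    def write(self, change):
        """Return True if committed everywhere, False if rejected."""
        prepared = []
        for server in self.servers:
            if server.prepare(change):
                prepared.append(server)
            else:
                # Any "no" vote aborts the change: metadata is read only.
                for p in prepared:
                    p.abort()
                return False
        for server in self.servers:
            server.commit()
        return True

    def read(self, key):
        """Serve reads from the first config server that is up."""
        for server in self.servers:
            if server.up:
                return server.metadata.get(key)
        raise RuntimeError("no config server available")
```

In this model, taking one server down causes `write` to return `False`, while `read` still answers from either of the remaining servers.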
8. Shards
• Can be master, master/slave or replica sets
• Replica sets give sharding + full auto-failover
• Regular mongod processes
9. mongos
• Sharding Router
• Acts just like a mongod to clients
• Can have 1 or as many as you want
• Can run on the app server, so no extra network traffic
11. Queries
• By shard key: routed
• sorted by shard key: routed in order
• by non shard key: scatter gather
• sorted by non shard key: distributed merge sort
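The four query cases above can be sketched with a toy router over in-memory shards. This is a simplified illustration, not the mongos implementation; the `Router` class, the `user_id` shard key, the `age` field, and the sample documents are all hypothetical. Shards are assumed to hold contiguous shard key ranges, in order.

```python
import bisect
import heapq
from operator import itemgetter


class Router:
    """Toy query router over range-partitioned, in-memory shards."""

    def __init__(self, shard_key, split_points, shards):
        # shards[i] is a list of documents whose shard key falls in range i.
        self._key = shard_key
        self._splits = list(split_points)
        self._shards = list(shards)

    def find_by_shard_key(self, value):
        # By shard key: routed to the single shard that owns the value.
        shard = self._shards[bisect.bisect_right(self._splits, value)]
        return [doc for doc in shard if doc[self._key] == value]

    def find_sorted_by_shard_key(self):
        # Sorted by shard key: visit shards in range order and concatenate.
        out = []
        for shard in self._shards:
            out.extend(sorted(shard, key=itemgetter(self._key)))
        return out

    def find(self, field, value):
        # By non shard key: scatter the query to every shard, gather results.
        out = []
        for shard in self._shards:
            out.extend(doc for doc in shard if doc[field] == value)
        return out

    def find_sorted(self, field):
        # Sorted by non shard key: each shard sorts locally, then the
        # router merges the sorted streams (distributed merge sort).
        streams = [sorted(shard, key=itemgetter(field)) for shard in self._shards]
        return list(heapq.merge(*streams, key=itemgetter(field)))


# Hypothetical sample data: user_id < 100 on the first shard, the rest on the second.
ROUTER = Router(
    "user_id",
    [100],
    [
        [{"user_id": 42, "age": 25}, {"user_id": 5, "age": 40}],
        [{"user_id": 300, "age": 25}, {"user_id": 150, "age": 30}],
    ],
)
```

Only `find_by_shard_key` touches a single shard. The other three contact every shard, and `find_sorted` additionally needs the merge step on the router.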