Introduction to
MongoDB sharding
Jordi Soucheiron - @jordixou
Barcelona MongoDB User Group – 2015-06-29
About me
• Product engineer at ServerDensity
• Working with MongoDB in production for more than 4 years
• Python and PHP programmer
• Pybcn co-organizer
• FOSDEM volunteer
What is sharding?
It’s the system MongoDB uses to:
• Distribute writes
• Distribute primary reads
• Distribute data
• Or, in other words, grow horizontally and scale
What does it look like?
• [Two diagrams on the original slides, not reproduced here]
Nomenclature:
• Shard:
  • Logical data partition
  • Each shard is handled by a server or replica set
• Shard key:
  • Key that all documents MUST have
  • Decided by the user
• Chunk:
  • Logical data partition inside a shard
  • Can be split into 2 smaller chunks
  • Can be moved to another shard for balancing
What does each component do?
• Mongos processes route queries and data to the right shards
• Config servers hold metadata:
  • Which chunks exist
  • Which shard holds each chunk
  • Which chunks are being migrated
• The shard servers hold the actual data
How does it work?
Whenever you read/write data this happens:
1. You run your query in your shell/driver
2. Your driver contacts the mongos process (a proxy)
3. The mongos process retrieves metadata from the config servers
4. Based on the metadata, mongos asks the shards affected by the query to run their part of the job
5. Mongos returns the result
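The routing steps above can be sketched in plain JavaScript, without any MongoDB API. The chunk table, the shard names and the `targetShards` function are all hypothetical and only illustrate the idea; a real mongos caches this metadata from the config servers and handles many more cases (ranges, multi-field keys, migrations in progress).

```javascript
// Hypothetical chunk metadata, roughly what a mongos caches from the config servers.
// Each chunk covers the shard key range [min, max) and lives on exactly one shard.
const chunks = [
  { min: -Infinity, max: 1000, shard: "shard0" },
  { min: 1000, max: 5000, shard: "shard1" },
  { min: 5000, max: Infinity, shard: "shard0" },
];

// Return the shards that have to run a query.
// A query on an exact shard key value targets a single shard;
// a query without the shard key must be sent to every shard.
function targetShards(query, shardKey) {
  const allShards = [...new Set(chunks.map((c) => c.shard))].sort();
  if (!(shardKey in query)) {
    return allShards; // scatter-gather
  }
  const value = query[shardKey];
  const chunk = chunks.find((c) => value >= c.min && value < c.max);
  return [chunk.shard];
}

console.log(targetShards({ i: 1234 }, "i")); // targeted query
console.log(targetShards({ name: "x" }, "i")); // query without the shard key
```

This is also why the shard key should appear in as many queries as possible: only those queries can be sent to a single shard.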
Data partitioning
Your data will be split into chunks based on your shard key.
Choosing a good shard key
A good shard key has to:
• Be used in ALL queries
• Allow a huge number of possible values:
  • SHA-1 hash -> good
  • Phone number -> not bad
  • Zip code -> bad
  • Boolean -> awful
• Have values evenly distributed across the whole key space
If your shard key has high cardinality but its values are not evenly distributed across the key space, use a hashed shard key.
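A small sketch of why hashing helps when key values cluster in one part of the key space. Everything here is an assumption made for illustration: the number of shards, the size of the key space, and the use of MD5 from Node's `crypto` module as a stand-in hash. It makes no claim about the hash function MongoDB uses internally.

```javascript
const crypto = require("crypto");

const SHARDS = 4;
const KEY_SPACE = 1000000; // assumed size of the key space for the ranged split

// Ranged partitioning: the key space is cut into SHARDS equal ranges.
function rangedBucket(key) {
  return Math.min(SHARDS - 1, Math.floor(key / (KEY_SPACE / SHARDS)));
}

// Hashed partitioning: hash the key first, then cut the 32-bit hash space
// into SHARDS equal ranges. MD5 is only a stand-in hash function here.
function hashedBucket(key) {
  const digest = crypto.createHash("md5").update(String(key)).digest();
  return Math.floor(digest.readUInt32BE(0) / (2 ** 32 / SHARDS));
}

// Count how many keys land in each bucket.
function distribution(bucketFn, keys) {
  const counts = new Array(SHARDS).fill(0);
  for (const key of keys) {
    counts[bucketFn(key)] += 1;
  }
  return counts;
}

// 1000 keys clustered at the top of the key space (for example, recent ids).
const keys = Array.from({ length: 1000 }, (_, n) => 999000 + n);
console.log("ranged:", distribution(rangedBucket, keys));
console.log("hashed:", distribution(hashedBucket, keys));
```

With ranged partitioning every key lands in the last bucket, while the hashed version spreads the same keys over all buckets. In the mongo shell a hashed shard key is requested by passing `"hashed"` as the key type, for example `sh.shardCollection("test.testdata", {i: "hashed"})`.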
Chunk partitioning
Whenever a chunk reaches a certain size, the mongos process will try to split it into two.
The split will fail if all docs in the chunk share the same shard key value.
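The split rule can be illustrated with a toy model in which a chunk is just an array of numeric shard key values. The `splitChunk` function is hypothetical and much simpler than the real split logic; it only shows why a chunk whose documents all share one key value cannot be split.

```javascript
// Try to split a chunk (an array of numeric shard key values) into two chunks.
// Returns null when no valid split point exists.
function splitChunk(keys) {
  const sorted = [...keys].sort((a, b) => a - b);
  // Start from the median so both halves get a similar number of documents.
  let splitPoint = sorted[Math.floor(sorted.length / 2)];
  // Documents with the same key value must stay in the same chunk, so the
  // split point has to be strictly greater than the smallest key.
  if (splitPoint === sorted[0]) {
    splitPoint = sorted.find((k) => k > sorted[0]);
  }
  if (splitPoint === undefined) {
    return null; // every document has the same shard key value
  }
  return {
    left: sorted.filter((k) => k < splitPoint),
    right: sorted.filter((k) => k >= splitPoint),
  };
}

console.log(splitChunk([1, 2, 3, 4])); // two chunks of two documents each
console.log(splitChunk([7, 7, 7])); // null: nothing to split on
```

This is the practical reason a low-cardinality shard key (a boolean, for instance) is a bad choice: chunks keep growing and can never be split.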
Balancing
• Inevitably, some shards will get more chunks than others
• The sharded cluster will automatically move chunks from crowded shards to under-populated shards:
  • It’s possible to start/stop and customize the balancing algorithm
  • It’s possible to manually move chunks around
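A rough sketch of the balancing idea, assuming the only input is the number of chunks per shard. The `balance` function and its `threshold` parameter are hypothetical; the real balancer uses its own thresholds and migrates actual data, not counters.

```javascript
// Hypothetical balancing round: move one chunk at a time from the shard with
// the most chunks to the shard with the fewest, until the gap is small enough.
function balance(chunkCounts, threshold = 2) {
  const counts = { ...chunkCounts };
  const limit = Math.max(2, threshold); // a gap below 2 cannot be improved by moving a chunk
  const migrations = [];
  for (;;) {
    const names = Object.keys(counts).sort((a, b) => counts[a] - counts[b]);
    const least = names[0];
    const most = names[names.length - 1];
    if (counts[most] - counts[least] < limit) {
      break;
    }
    counts[most] -= 1;
    counts[least] += 1;
    migrations.push({ from: most, to: least });
  }
  return { counts, migrations };
}

const result = balance({ shard0: 20, shard1: 0 });
console.log(result.counts, result.migrations.length);
```

In the mongo shell the balancer is controlled with helpers such as `sh.stopBalancer()` and `sh.startBalancer()`, and a chunk can be moved by hand with `sh.moveChunk()`.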
HA in a sharded cluster
To achieve HA in a sharded cluster you’ll need:
• 3 config servers:
  • As long as 1 is up you’ll be able to read from and write to the collections
  • If a config server is down the metadata will be read-only, so you won’t be able to:
    • Split chunks
    • Balance the cluster
    • Add shards
• N shards, each one with at least:
  • 2 data-bearing nodes
  • An arbiter or another data-bearing node
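The reason each shard needs a third voting member can be shown with a little arithmetic: a replica set can only elect a primary while a majority of its voting members is reachable. The helper names below are made up for this sketch.

```javascript
// Number of votes needed to elect a primary in a replica set.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Number of voting members that can fail while a primary can still be elected.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

for (const n of [1, 2, 3, 5]) {
  console.log(n, "members: majority", majority(n), "tolerates", faultTolerance(n), "failure(s)");
}
```

With only 2 voting members the set tolerates no failures, which is why the third member is required; it can be an arbiter because it only needs to vote, not to hold data.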
Demo time!
Creating a new demo sharded cluster:
sudo service mongod stop
mkdir shard0
mkdir shard1
mkdir config
# Start the config server
mongod --fork --syslog --configsvr --dbpath config --port 27019
# Start the shard servers
mongod --fork --syslog --dbpath shard0 --port 30000
mongod --fork --syslog --dbpath shard1 --port 30001
# Start the mongos process
mongos --fork --syslog --configdb localhost:27019
# Add shards
mongo initSharding.js
Demo time!
Creating a new demo sharded cluster:
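// Adding the shards to the cluster
sh.addShard("localhost:30000");
sh.addShard("localhost:30001");
// Adding test data (the shell uses the "test" database by default)
for (var i = 0; i < 10000; i++) {
    db.testdata.insert({"i": i});
}
// Creating the index that backs the shard key
db.testdata.createIndex({"i": 1});
// Enabling sharding on the database, then sharding the collection on "i"
sh.enableSharding("test");
sh.shardCollection("test.testdata", {"i": 1});
// Manually splitting chunks every 500 values of "i"
for (var i = 1; i < 20; i++) {
    sh.splitAt("test.testdata", {"i": i * 500});
}
// Status: sh.status() prints the report itself; true asks for verbose output
sh.status(true);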
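As a quick check of what the manual split loop should produce, the split points can be computed in plain JavaScript. This is only arithmetic on the loop above and assumes the collection starts as a single chunk; `"MinKey"` and `"MaxKey"` are plain strings standing in for the lower and upper bounds of the key space.

```javascript
// Split points used by the demo loop: i * 500 for i = 1..19.
const splitPoints = [];
for (let i = 1; i < 20; i++) {
  splitPoints.push(i * 500);
}

// N split points cut the key space into N + 1 chunks.
const bounds = ["MinKey", ...splitPoints, "MaxKey"];
const chunkRanges = [];
for (let n = 0; n < bounds.length - 1; n++) {
  chunkRanges.push([bounds[n], bounds[n + 1]]);
}

console.log(chunkRanges.length); // number of chunks after all splits
console.log(chunkRanges[0], chunkRanges[chunkRanges.length - 1]);
```

Under that assumption the 19 split points give 20 chunks, and since the test data runs from 0 to 9999, each chunk covers 500 documents.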
Questions?
We’re hiring!
We’re looking for awesome engineers!
Talk to me after the presentation or go to:
https://www.serverdensity.com/jobs/
Code
https://github.com/jsoucheiron/mongodb-barcelona-sharding-introduction
Slides
http://www.slideshare.net/jordixou (soon)
