From Dev to Prod with Sharded Mongo DB ClustersMay 2013Johan AnderssonSeveralnines ABjohan@severalnines.com
Topics¤  Best Practice Deployment¤  Keeping it running¤  Scaling with ShardsNote: Most features mentioned in this prese...
3Copyright 2011 Severalnines AB“An OS check-up a day keeps downtime away”Ancient Chinese wisdom
Distributed Computing 101¤  Network latency is not 0¤  Network is unreliable¤  Network bandwidth is not infinite¤  CPU...
MongoDB Concepts¤  Mongos¤  Query routers, routes queries to shards¤  Caches data from Config Servers¤  Config Servers...
Developer Setup¤  A good test setup for a developer:¤  One or three config Servers¤  Thee shard servers¤  One or more ...
Test Setup¤  Recommendations¤  One or three config Servers¤  >1 shard - thee shard servers¤  One or more Mongos¤  Sta...
Production Setup¤  Start with a good and sound architecture from day one!¤  A sharded architecture will allow you to sca...
Production Setup9Copyright 2011 Severalnines ABPRIMARY SECONDARYrepl
Production Setup10Copyright 2011 Severalnines ABHard to scale, too crowdedGo for Best Practice insteadPRIMARY SECONDARYrepl
Production Setup11Copyright 2011 Severalnines AB…Shard0 Shard1 ShardN…
Production Setup¤  Allows you to scale - Add more shards and mongos easily¤  It is proven – used by many large organizat...
When processes fails¤  Config Server process fails¤  If one config server fails, no chunk migration between shards¤  No...
Keeping it running¤  RAM is a scarce resource – keep as much as possible in it!¤  Use short field names in the Documents...
Keeping it running¤  Replication Lag¤  Producer / Consumer problem¤  Temporary or Permanent Problem?¤  Permanent probl...
Keeping it running¤  Page Faults¤  MongoDB uses mmap:ed files, allocated from the VMM address space.Mmapped files does n...
Not cool17Copyright 2011 Severalnines AB
Not cool18Copyright 2011 Severalnines AB
You are done for!19Copyright 2011 Severalnines AB
Avoid the situation¤  Keep data set in RAM¤  Use fast io-subsystem since OS is managing the paging¤  SSD¤  IOPS on EC2...
Scaling with Shards¤  Shard Keys¤  When to scale?¤  Add a shard21Copyright 2011 Severalnines AB
Database Size¤  db.stats()¤  Size of one db = dbStats.dataSize + dbStats.indexSize¤  Does it fit in RAM?¤  Plan ahead:...
Working set¤  Working Set Analyzer in 2.4 helps:!"workingSet" : {!! !"note" : "thisIsAnEstimate",!! !"pagesInMemory" : 33...
Shard Key¤  Collections must have a Shard Key¤  The shard key must be indexed¤  Hash-based or Range based Sharding¤  H...
Adding Shards¤  Adding a new Shard is easy:¤  Deploy a new Replica-set:¤  PRIMARY - 192.168.0.151SECONDARY - 192.168.0....
Q&A26Copyright 2011 Severalnines AB
Resources27Copyright 2011 Severalnines AB¤  MongoDB Configuratorhttp://www.severalnines.com/mongodb-configurator/¤  -Clu...
Follow us on28Copyright 2011 Severalnines AB¤  Twitter : @severalnines¤  Facebook: www.facebook.com/severalnines¤  Slid...
Thank you!29Copyright 2011 Severalnines AB
Upcoming SlideShare
Loading in...5
×

Development to Production with Sharded MongoDB Clusters

4,266

Published on

Severalnines presentation at MongoDB Stockholm Conference.
Presentation covers:
- mongoDB sharding/clustering concepts
- recommended dev/test/prod setups
- how to verify your deployment
- how to avoid downtime
- what MongoDB metrics to watch
- when to scale

Published in: Technology
0 Comments
10 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
4,266
On Slideshare
0
From Embeds
0
Number of Embeds
9
Actions
Shares
0
Downloads
64
Comments
0
Likes
10
Embeds 0
No embeds

No notes for slide

Development to Production with Sharded MongoDB Clusters

  1. 1. From Dev to Prod with Sharded Mongo DB ClustersMay 2013Johan AnderssonSeveralnines ABjohan@severalnines.com
  2. 2. Topics¤  Best Practice Deployment¤  Keeping it running¤  Scaling with ShardsNote: Most features mentioned in this presentation areapplicable only on MongoDB 2.2 or later.2Copyright 2011 Severalnines AB
  3. 3. 3Copyright 2011 Severalnines AB“An OS check-up a day keeps downtime away”Ancient Chinese wisdom
  4. 4. Distributed Computing 101¤  Network latency is not 0¤  Network is unreliable¤  Network bandwidth is not infinite¤  CPUs are not infinitely fast¤  Disks are not infinitely fast¤  Free RAM decreases fast¤  Virtualization does not make the underlying hardware morepowerful4Copyright 2011 Severalnines AB
  5. 5. MongoDB Concepts¤  Mongos¤  Query routers, routes queries to shards¤  Caches data from Config Servers¤  Config Servers¤  Records which shard holds what chunks¤  Shard Servers¤  Stores the actual data5Copyright 2011 Severalnines AB
  6. 6. Developer Setup¤  A good test setup for a developer:¤  One or three config Servers¤  Thee shard servers¤  One or more Mongos¤  But do set it up as a shard server so you can validate queries(explain) to make sure they behave as intended.¤  Using correct indexes¤  Vagrant image -http://alexyu.se/content/2013/03/vagrant-box-severalnines-database-cluster-deployment-scripts6Copyright 2011 Severalnines AB
  7. 7. Test Setup¤  Recommendations¤  One or three config Servers¤  >1 shard - thee shard servers¤  One or more Mongos¤  Stay close to your production system.¤  Two shards (even if prod use one only) allows you to:¤  Verify queries use the correct indexes¤  Check if one or all shards are hit¤  Extremely useful when adding shards to Prod to avoidperformance regression.7Copyright 2011 Severalnines AB
  8. 8. Production Setup¤  Start with a good and sound architecture from day one!¤  A sharded architecture will allow you to scale easily¤  Avoid un-necessary craziness and risk to switch a non-sharded setup to a sharded¤  You do want three copies of your data8Copyright 2011 Severalnines AB
  9. 9. Production Setup9Copyright 2011 Severalnines ABPRIMARY SECONDARYrepl
  10. 10. Production Setup10Copyright 2011 Severalnines ABHard to scale, too crowdedGo for Best Practice insteadPRIMARY SECONDARYrepl
  11. 11. Production Setup11Copyright 2011 Severalnines AB…Shard0 Shard1 ShardN…
  12. 12. Production Setup¤  Allows you to scale - Add more shards and mongos easily¤  It is proven – used by many large organizations¤  Deploy…¤  Mongos¤  Config servers¤  Shard servers (mongod --shardsvr)¤  … on individual servers12Copyright 2011 Severalnines AB
  13. 13. When processes fails¤  Config Server process fails¤  If one config server fails, no chunk migration between shards¤  No chunk splitting¤  Shards are still read/writeable¤  Mongos fails¤  You will have many¤  Shard server fails¤  If PRIMARY fails à One SECONDARY will become PRIMARY¤  If SECONDADY fails à Restart and it will resync with PRIMARY¤  Restart the failed process!¤  Troll the logs to see it starts without errors13Copyright 2011 Severalnines AB
  14. 14. Keeping it running¤  RAM is a scarce resource – keep as much as possible in it!¤  Use short field names in the Documents to reduce size:¤  { first_name:’not good’}¤  {fn:’better’}¤  Important when you have many Documents¤  Compact collections:¤  Run on SECONDARY¤  db.mycollection.runCommand(“compact”)¤  Stop PRIMARY, SECONDARY becomes new PRIMARY.14Copyright 2011 Severalnines AB
  15. 15. Keeping it running¤  Replication Lag¤  Producer / Consumer problem¤  Temporary or Permanent Problem?¤  Permanent problem¤  Expunge the oplog?¤  Magic wand?¤  Disk, network, or load-dependent problem?¤  Remedy:¤  Faster disks: Improve disk subsystem -> IOPS or SSD, RAID configuration¤  Faster CPUs¤  Can you reduce writes to the shard(s)?¤  Scale with shards15Copyright 2011 Severalnines AB
  16. 16. Keeping it running¤  Page Faults¤  MongoDB uses mmap:ed files, allocated from the VMM address space.Mmapped files does not need to fit in RAM¤  OS manages what pages are mapped in RAM and not¤  Accessing data on a page not in RAM cause a page fault¤  Data will be loaded from DISK¤  DISK is slow¤  Performance will degrade¤  If the number of PFs is constant, no worry, else if increasing, problem in thehorizon.¤  Remedy:¤  Increase RAM in server (Active working set should fit in RAM)¤  Scale with shards¤  Add SSD16Copyright 2011 Severalnines AB
  17. 17. Not cool17Copyright 2011 Severalnines AB
  18. 18. Not cool18Copyright 2011 Severalnines AB
  19. 19. You are done for!19Copyright 2011 Severalnines AB
  20. 20. Avoid the situation¤  Keep data set in RAM¤  Use fast io-subsystem since OS is managing the paging¤  SSD¤  IOPS on EC2¤  Make sure the Host OS (Virtualized env) does not swap.¤  Lock mongod’s to CPUs, check /etc/interrupts, bind to CPUsnot handling ETH interrupts, avoids ctx switching.¤  Add shards before its too late¤  More disks/CPUs,RAM to handle the loadCopyright 2011 Severalnines AB20
  21. 21. Scaling with Shards¤  Shard Keys¤  When to scale?¤  Add a shard21Copyright 2011 Severalnines AB
  22. 22. Database Size¤  db.stats()¤  Size of one db = dbStats.dataSize + dbStats.indexSize¤  Does it fit in RAM?¤  Plan ahead:¤  avgObjSize x expectedNoOfDocuments¤  Does it fit in RAM?¤  How many documents can you store?22Copyright 2011 Severalnines AB
  23. 23. Working set¤  Working Set Analyzer in 2.4 helps:!"workingSet" : {!! !"note" : "thisIsAnEstimate",!! !"pagesInMemory" : 33688, //4KB pg_size!! !"computationTimeMicros" : 15283,!! !"overSeconds" : 1923 //dist pg_new to pg_oldest
},!¤  Trend overSeconds and pagesInMemory23Copyright 2011 Severalnines AB
  24. 24. Shard Key¤  Collections must have a Shard Key¤  The shard key must be indexed¤  Hash-based or Range based Sharding¤  Hash-bashed great for exact match¤  Subscriber/user databases¤  Queries must use the shard key¤  If not, all shards will be queried à slow¤  Check and verify with .explain()24Copyright 2011 Severalnines AB
  25. 25. Adding Shards¤  Adding a new Shard is easy:¤  Deploy a new Replica-set:¤  PRIMARY - 192.168.0.151SECONDARY - 192.168.0.152SECONDARY - 192.168.0.153¤  Register the Replica Set:¤  sh.addShard(“myshard2/192.168.0.151:27017”);¤  Finding the time to add shards is harder¤  Rebalancing costs and query performance will suffer due tochunk migration.¤  Do it off-peak if possible¤  MongoDB will automatically rebalance the shards.25Copyright 2011 Severalnines AB
  26. 26. Q&A26Copyright 2011 Severalnines AB
  27. 27. Resources27Copyright 2011 Severalnines AB¤  MongoDB Configuratorhttp://www.severalnines.com/mongodb-configurator/¤  -ClusterControl for MongoDBhttp://www.severalnines.com/resources/cmon-mongodb-cluster-management-monitoring-tool¤  Installing ClusterControl on top of existing MongoDB Clusterhttp://www.severalnines.com/blog/install-clustercontrol-top-existing-mongodb-sharded-cluster¤  Vagrant image -http://alexyu.se/content/2013/03/vagrant-box-severalnines-database-cluster-deployment-scripts
  28. 28. Follow us on28Copyright 2011 Severalnines AB¤  Twitter : @severalnines¤  Facebook: www.facebook.com/severalnines¤  Slideshare : www.slideshare.net/severalnines¤  Linked-in: www.linkedin.com/company/severalnines
  29. 29. Thank you!29Copyright 2011 Severalnines AB
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×