Availability and scalability in mongo

Published in: Software

  1. Introduction to Availability & Scalability in MongoDB (Md Khairul Anam, SoftwarePeople)
  2. Availability
  3. Replica Set – Creation
  4. Replica Set – Initialize
  5. Replica Set – Failure
  6. Replica Set – Failover
  7. Replica Set – Recovery
  8. Replica Set – Recovered
  9. Replica Set Roles: Factors and Conditions that Affect Elections
      • Heartbeats
      • Priority Comparisons
      • Optime
      • Connections
      • Network Partitions
  10. Strong Consistency
  11. Delayed Consistency
  12. Maintenance and Upgrade
      • Rolling upgrade/maintenance
        – Start with the secondaries
        – Upgrade the primary last
  13. Replica Set – 1 Data Center
      • Single data center
      • Single switch & power
      • Points of failure:
        – Power
        – Network
        – Data center
      • Automatic recovery from a single node crash
  14. Replica Set – 2 Data Centers
      • Multi data center
      • DR node for safety
      • Can’t do a multi data center durable write safely, since there is only 1 node in the distant DC
  15. Replica Set – 3 Data Centers
      • Three data centers
      • Can survive full data center loss
      • Can do w = { dc : 2 } to guarantee a write in 2 data centers
  16. Questions?
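
The data center layouts on slides 13 to 15 can be checked with a small sketch. It relies on the replica set rule that a primary can only be elected while a strict majority of voting members is reachable. The member counts per data center below are assumptions made for illustration (the slides do not state them), and the function names are hypothetical. Nothing here talks to a real MongoDB deployment.

```python
def has_majority(members_per_dc, failed_dcs):
    """Return True if the surviving voting members form a strict majority."""
    total = sum(members_per_dc.values())
    alive = sum(n for dc, n in members_per_dc.items() if dc not in failed_dcs)
    return alive > total // 2


def survives_any_single_dc_loss(layout):
    """Return True if a primary can still be elected after losing any one data center."""
    return all(has_majority(layout, {dc}) for dc in layout)


# Assumed member counts per data center (hypothetical, not from the slides).
one_dc = {"dc1": 3}
two_dc = {"dc1": 2, "dc2": 1}    # dc2 holds the single DR node
three_dc = {"dc1": 2, "dc2": 2, "dc3": 1}

for name, layout in [("1 DC", one_dc), ("2 DCs", two_dc), ("3 DCs", three_dc)]:
    print(name, "survives any single data center loss:",
          survives_any_single_dc_loss(layout))
```

With these assumed layouts, only the three data center layout keeps a majority after losing any one site. The two data center layout survives the loss of the DR site but not the loss of the main site.
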
  17. Scalability
  18. Examining Growth
      • User Growth
        – 1995: 0.4% of the world’s population
        – Today: 30% of the world is online (~2.2B)
      • Data Set Growth
        – Facebook’s data set is around 100 petabytes
        – 4 billion photos taken in the last year (4x a decade ago)
  19. Vertical Scalability (Scale Up)
      • Read/write throughput exceeds I/O
      • Working set (indexes and data) exceeds physical memory
  20. Horizontal Scalability (Scale Out)
  21. Data Store Scalability Solutions
      • Custom Hardware
        – Oracle
      • Custom Software
        – Facebook + MySQL
        – Google
      • MongoDB Auto-Sharding: a data store that is
        – Free
        – Publicly available
        – Open Source (https://github.com/mongodb/mongo)
        – Horizontally scalable
        – Application independent
  22. Sharded Cluster Architecture
  23. What is a Shard?
      • A shard is a node of the cluster
      • A shard can be a single mongod or a replica set
  24. Meta Data Storage
      • Config Server
        – Stores cluster chunk ranges and locations
        – Can have only 1 or 3 (production must have 3)
        – Not a replica set
  25. Routing and Managing Data
      • Mongos
        – Acts as a router / balancer
        – No local data (persists to config database)
        – Can have 1 or many
  26. Partitioning
      • User defines the shard key
      • Shard key defines a range of data
      • Key space is like points on a line
      • A range is a segment of that line
  27. What is a Shard Key?
      • Shard key is used to partition your collection
      • Shard key must exist in every document
      • Shard key must be indexed
      • Shard key is used to route requests to shards
  28. Shards and Shard Keys (diagram: shards and their shard key ranges)
  29. Data Distribution
      • Initially 1 chunk
      • Default max chunk size: 64 MB
      • MongoDB automatically splits & migrates chunks when the max is reached
  30. Cluster Request Routing
      • Targeted Queries
      • Scatter Gather Queries
      • Scatter Gather Queries with Sort
  31. Questions?
  32. Thank You
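
The partitioning and request routing ideas on slides 26 to 30 can be illustrated with a minimal sketch. It models the key space as a line cut into ranges (chunks), each owned by a shard, and shows the difference between a targeted query and a scatter gather query. The chunk boundaries, shard names, the `user_id` shard key, and the function names are all hypothetical. This is a simplified model of range-based routing, not how mongos is implemented, and it only handles equality matches on the shard key.

```python
import bisect

# Hypothetical chunk table: each chunk covers [lower bound, next lower bound).
CHUNK_LOWER_BOUNDS = [float("-inf"), 100, 200, 300]
CHUNK_SHARDS = ["shard0", "shard1", "shard0", "shard2"]


def shard_for_key(key):
    """Find the shard owning the chunk whose range contains the key."""
    idx = bisect.bisect_right(CHUNK_LOWER_BOUNDS, key) - 1
    return CHUNK_SHARDS[idx]


def route(query, shard_key="user_id"):
    """Return the set of shards that a query must be sent to."""
    if shard_key in query:
        # Targeted query: the shard key value pins the request to one shard.
        return {shard_for_key(query[shard_key])}
    # Scatter gather query: without the shard key, every shard is asked.
    return set(CHUNK_SHARDS)


print(route({"user_id": 150}))   # targeted
print(route({"name": "anam"}))   # scatter gather
```

A query that includes the shard key touches a single shard, while a query without it fans out to every shard, which is why the choice of shard key matters for routing.
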
