Back to Basics: Build Something Big With MongoDB

Published in: Technology

  1. 1. ‘MongoDB Back to Basics’: Build Something Big with MongoDB • Sam Weaver, Solution Architect, MongoDB • #MongoDBBasics
  2. 2. Agenda • Replica Set Lifecycle • Developing with Replica Sets • Scaling your database
  3. 3. Q&A • Virtual Genius Bar – Use chat to post questions – Solution Architecture / Support team are on hand – Make use of them during the sessions!
  4. 4. Recap • Introduction to MongoDB • Thinking in documents
  5. 5. Deployment Considerations
  6. 6. Working Set Exceeds Physical Memory
  7. 7. Why Replication? • How many have faced node failures? • How many have been woken up from sleep to do a fail-over(s)? • How many have experienced issues due to network latency? • Different uses for data – Normal processing – Simple analytics
  8. 8. Replica Set Lifecycle
  9. 9. Replica Set – Creation
  10. 10. Replica Set – Initialize
  11. 11. Replica Set – Failure
  12. 12. Replica Set – Failover
  13. 13. Replica Set – Recovery
  14. 14. Replica Set – Recovered
  15. 15. Developing with Replica Sets
  16. 16. Strong Consistency
  17. 17. Delayed Consistency
  18. 18. Write Concern • Network acknowledgement • Wait for error • Wait for journal sync • Wait for replication
  19. 19. Unacknowledged
  20. 20. MongoDB Acknowledged (wait for error)
  21. 21. Wait for Journal Sync
  22. 22. Wait for Replication
  23. 23. Tagging • Control where data is written to, and read from • Each member can have one or more tags – tags: {dc: "ny"} – tags: {dc: "ny", subnet: "192.168", rack: "row3rk7"} • Replica set defines rules for write concerns • Rules can change without changing app code
  24. 24. Tagging Example
      {
        _id : "mySet",
        members : [
          {_id : 0, host : "A", tags : {"dc": "ny"}},
          {_id : 1, host : "B", tags : {"dc": "ny"}},
          {_id : 2, host : "C", tags : {"dc": "sf"}},
          {_id : 3, host : "D", tags : {"dc": "sf"}},
          {_id : 4, host : "E", tags : {"dc": "cloud"}}
        ],
        settings : {
          getLastErrorModes : {
            allDCs : {"dc" : 3},
            someDCs : {"dc" : 2}
          }
        }
      }
      > db.blogs.insert({...})
      > db.runCommand({getLastError : 1, w : "someDCs"})
  25. 25. Wait for Replication (Tagging)
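
The tagged modes on slide 24 can be read as "the write must be acknowledged by members covering at least N distinct values of a tag". The Java sketch below models only that rule in memory; it is not driver or server code, and the class and method names (`TaggedWriteConcern`, `isSatisfied`, `exampleMembers`) are hypothetical names chosen for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of tag-based write concern modes; not MongoDB driver or server code. */
class TaggedWriteConcern {

    /**
     * Returns true when the members that acknowledged a write cover at least
     * `required` distinct values of the tag `tagKey`.
     *
     * @param memberTags   host name -> tags of that member, as in the replica set config
     * @param acknowledged hosts that have acknowledged the write so far
     */
    static boolean isSatisfied(Map<String, Map<String, String>> memberTags,
                               List<String> acknowledged,
                               String tagKey,
                               int required) {
        Set<String> distinctValues = new HashSet<>();
        for (String host : acknowledged) {
            Map<String, String> tags = memberTags.get(host);
            if (tags != null && tags.containsKey(tagKey)) {
                distinctValues.add(tags.get(tagKey));
            }
        }
        return distinctValues.size() >= required;
    }

    /** The five members from the tagging example: A and B in ny, C and D in sf, E in cloud. */
    static Map<String, Map<String, String>> exampleMembers() {
        return Map.of(
            "A", Map.of("dc", "ny"),
            "B", Map.of("dc", "ny"),
            "C", Map.of("dc", "sf"),
            "D", Map.of("dc", "sf"),
            "E", Map.of("dc", "cloud"));
    }
}
```

Under this model, `someDCs` (`{"dc" : 2}`) is not met when only A and B have acknowledged, because both sit in `ny`, but it is met once A and C have. `allDCs` (`{"dc" : 3}`) needs one acknowledgement from each of `ny`, `sf`, and `cloud`.
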
  26. 26. Read Preference Modes • 5 modes – primary (default) – primaryPreferred – secondary – secondaryPreferred – nearest • When more than one node is eligible, the closest node is used for reads (all modes except primary)
  27. 27. Tagged Read Preference • Custom read preferences • Control where you read from by (node) tags – E.g. { "disk": "ssd", "use": "reporting" } • Use in conjunction with standard read preferences – Except primary
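
The read preference rules on slides 26 and 27 can be sketched as a small selection function: filter members by mode and tags, then pick the closest eligible one. This is a simplified in-memory model, not driver code. The `ReadPreferenceSketch` class, its `Member` type, and the use of a single ping time as "closeness" are assumptions made for illustration; real drivers apply more elaborate latency rules.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Simplified model of read preference selection; not MongoDB driver code. */
class ReadPreferenceSketch {

    enum Mode { PRIMARY, PRIMARY_PREFERRED, SECONDARY, SECONDARY_PREFERRED, NEAREST }

    static final class Member {
        final String host;
        final boolean primary;
        final long pingMillis;            // assumed stand-in for "closeness"
        final Map<String, String> tags;

        Member(String host, boolean primary, long pingMillis, Map<String, String> tags) {
            this.host = host;
            this.primary = primary;
            this.pingMillis = pingMillis;
            this.tags = tags;
        }
    }

    /** Picks the member a read would go to, or empty if no member is eligible. */
    static Optional<Member> select(List<Member> members, Mode mode, Map<String, String> requiredTags) {
        Member primary = null;
        List<Member> taggedSecondaries = new ArrayList<>();
        List<Member> taggedMembers = new ArrayList<>();
        for (Member m : members) {
            if (m.primary) {
                primary = m;
            }
            // A member matches when it carries every required tag with the same value.
            if (m.tags.entrySet().containsAll(requiredTags.entrySet())) {
                taggedMembers.add(m);
                if (!m.primary) {
                    taggedSecondaries.add(m);
                }
            }
        }
        switch (mode) {
            case PRIMARY:
                return Optional.ofNullable(primary);   // tags do not apply to primary mode
            case PRIMARY_PREFERRED:
                return primary != null ? Optional.of(primary) : closest(taggedSecondaries);
            case SECONDARY:
                return closest(taggedSecondaries);
            case SECONDARY_PREFERRED:
                Optional<Member> secondary = closest(taggedSecondaries);
                return secondary.isPresent() ? secondary : Optional.ofNullable(primary);
            case NEAREST:
                return closest(taggedMembers);
            default:
                throw new IllegalArgumentException("unknown mode: " + mode);
        }
    }

    /** Lowest ping time wins among the eligible candidates. */
    private static Optional<Member> closest(List<Member> candidates) {
        return candidates.stream().min(Comparator.comparingLong((Member m) -> m.pingMillis));
    }
}
```

For example, with a primary and two secondaries tagged `{"use": "reporting"}`, calling `select` with `Mode.SECONDARY` and that tag set returns whichever tagged secondary has the lower ping time.
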
  28. 28. Our application
      import com.mongodb.DB;
      import com.mongodb.MongoClient;
      import com.mongodb.ServerAddress;
      import java.util.Arrays;
      // Connect to a replica set by supplying a seed list of members;
      // the driver auto-discovers the primary.
      MongoClient mongoClient = new MongoClient(Arrays.asList(
          new ServerAddress("localhost", 27017),
          new ServerAddress("localhost", 27018),
          new ServerAddress("localhost", 27019)));
      DB db = mongoClient.getDB("mydb");
  29. 29. Scaling
  30. 30. Working Set Exceeds Physical Memory
  31. 31. When to consider Sharding? • When a specific resource becomes a bottleneck on a machine or replica set – RAM – Disk IO – Storage – Concurrency
  32. 32. Vertical Scalability (Scale Up)
  33. 33. Horizontal Scalability (Scale Out)
  34. 34. Partitioning • User defines shard key • Shard key defines range of data • Key space is like points on a line • Range is a segment of that line
  35. 35. Data Distribution • Initially 1 chunk • Default max chunk size: 64 MB • MongoDB automatically splits and migrates chunks when the max is reached
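
The "points on a line" picture from slides 34 and 35 can be sketched with a sorted map from each chunk's lower bound to the shard that owns it. This is a toy model under stated assumptions: integer shard key values, explicit split points instead of size-based splitting, and a hypothetical `ChunkMap` class that does not correspond to any MongoDB API.

```java
import java.util.TreeMap;

/** Toy model of range partitioning: one chunk at first, then splits and migrations. */
class ChunkMap {

    // Lower bound (inclusive) of each chunk -> name of the shard that owns it.
    private final TreeMap<Integer, String> chunks = new TreeMap<>();

    /** Starts with a single chunk covering the whole key space. */
    ChunkMap(String initialShard) {
        chunks.put(Integer.MIN_VALUE, initialShard);
    }

    /** Splits the chunk containing splitPoint; the new chunk stays on the same shard. */
    void split(int splitPoint) {
        chunks.putIfAbsent(splitPoint, shardFor(splitPoint));
    }

    /** Moves the chunk that starts at chunkLowerBound to another shard. */
    void migrate(int chunkLowerBound, String toShard) {
        if (!chunks.containsKey(chunkLowerBound)) {
            throw new IllegalArgumentException("no chunk starts at " + chunkLowerBound);
        }
        chunks.put(chunkLowerBound, toShard);
    }

    /** Finds the shard whose chunk range contains the given shard key value. */
    String shardFor(int shardKeyValue) {
        return chunks.floorEntry(shardKeyValue).getValue();
    }

    int chunkCount() {
        return chunks.size();
    }
}
```

After `split(100)` and `migrate(100, "shard1")`, keys below 100 still resolve to the original shard while keys from 100 upward resolve to `shard1`, which is the lookup a router needs for a targeted query.
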
  36. 36. Architecture
  37. 37. What is a Shard? • Shard is a node of the cluster • Shard can be a single mongod or a replica set
  38. 38. Meta Data Storage • Config Server – Stores cluster chunk ranges and locations – Can have only 1 or 3 (production must have 3) – Not a replica set
  39. 39. Routing and Managing Data • Mongos – Acts as a router / balancer – No local data (persists to config database) – Can have 1 or many
  40. 40. Sharding infrastructure
  41. 41. Cluster Request Routing • Targeted Queries • Scatter Gather Queries • Scatter Gather Queries with Sort
  42. 42. Cluster Request Routing: Targeted Query
  43. 43. Routable request received
  44. 44. Request routed to appropriate shard
  45. 45. Shard returns results
  46. 46. Mongos returns results to client
  47. 47. Cluster Request Routing: Non-Targeted Query
  48. 48. Non-Targeted Request Received
  49. 49. Request sent to all shards
  50. 50. Shards return results to mongos
  51. 51. Mongos returns results to client
  52. 52. Cluster Request Routing: Non-Targeted Query with Sort
  53. 53. Non-Targeted request with sort received
  54. 54. Request sent to all shards
  55. 55. Query and sort performed locally
  56. 56. Shards return results to mongos
  57. 57. Mongos merges sorted results
  58. 58. Mongos returns results to client
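
Slides 53 to 58 describe each shard sorting locally and the router merging the sorted streams. The slides do not say how mongos implements the merge, so the sketch below shows a generic k-way merge with a priority queue as one plausible way to do it. The `SortedMerge` class is a hypothetical name, and integer results are an assumption for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Generic k-way merge of per-shard sorted results; an illustration, not mongos code. */
class SortedMerge {

    /** Merges lists that are each already sorted ascending into one ascending list. */
    static List<Integer> merge(List<List<Integer>> shardResults) {
        // Heap entries are {value, shardIndex, positionInThatShardsList}, ordered by value.
        PriorityQueue<int[]> heap = new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        for (int s = 0; s < shardResults.size(); s++) {
            if (!shardResults.get(s).isEmpty()) {
                heap.add(new int[] {shardResults.get(s).get(0), s, 0});
            }
        }
        List<Integer> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] head = heap.poll();
            merged.add(head[0]);
            // Refill the heap with the next result from the shard we just consumed.
            List<Integer> source = shardResults.get(head[1]);
            int next = head[2] + 1;
            if (next < source.size()) {
                heap.add(new int[] {source.get(next), head[1], next});
            }
        }
        return merged;
    }
}
```

Because every shard has already sorted its own results, the router only compares the current head of each stream, so it never has to re-sort the full result set.
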
  59. 59. Shard Key
  60. 60. Shard Key • Shard key is immutable • Shard key values are immutable • Shard key must be indexed • Shard key limited to 512 bytes in size • Shard key used to route queries – Choose a field commonly used in queries • Only shard key can be unique across shards – `_id` field is only unique within individual shard
  61. 61. Summary
  62. 62. Things to remember • Size appropriately for your working set • Shard when you need to, not before • Pick a shard key wisely
  63. 63. Thank you
