MongoDB Fundamentals

From A Morning with MongoDB - Milan on October 24, 2012.

  1. MongoDB Basic Concepts
     Norberto Leite, Senior Solutions Architect, EMEA
     norberto@10gen.com, @nleite
     Thursday, 25 October 12
  2. Agenda
     • Overview
     • Replication
     • Scalability
     • Consistency & Durability
     • Flexibility, Developer Experience
  3. Your data needs started here... http://bit.ly/OT71M4
  4. ...but soon you had to be here http://bit.ly/Oxcsis
  5. Basic Concepts
     [diagram: Application; Document Oriented; High Performance; Fully Consistent; Horizontally Scalable]
     { author : "steve", date : new Date(), text : "About MongoDB...", tags : ["tech", "database"] }
  6. Tradeoff: Scale vs Functionality
     [chart: "scalability & performance" against "depth of functionality"; points: memcached, key/value, RDBMS]
  7. Replication
  8. Why do we need replication?
     • Failover
     • Backups
     • Secondary batch jobs
     • High availability
  9. Replica Sets: Data Availability across nodes
     • Data Protection
       • Multiple copies of the data
       • Spread across Data Centers, AZs
     • High Availability
       • Automated Failover
       • Automated Recovery
  10. Replica Sets
      [diagram: App; Primary (Write, Read); two Secondaries (Read); Asynchronous Replication]
  11. Replica Sets
      [diagram: App; Primary (Write, Read); two Secondaries (Read)]
  12. Replica Sets
      [diagram: App; old Primary; Automatic Election of new Primary; new Primary (Write, Read); Secondary (Read)]
  13. Replica Sets
      [diagram: App; old Primary now Recovering; New primary serves data; Primary (Write, Read); Secondary (Read)]
  14. Replica Sets
      [diagram: App; Secondary (Read); Primary (Write, Read); Secondary (Read)]
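
The failover sequence in the Replica Sets diagrams (primary fails, a new primary is elected, the old primary recovers and rejoins as a secondary) can be sketched as a toy state machine in plain JavaScript. This is only an illustration, not MongoDB's election protocol: the member names, the `up` flag, the `setUp` and `step` helpers, and the "first healthy secondary wins" rule are all assumptions made for the example.

```javascript
// Toy model of the failover sequence in the Replica Sets diagrams.
// Assumptions: member names, the `up` flag, and "first healthy secondary
// wins" are illustrative; real elections use votes and priorities.
function setUp(members, name, up) {
  // Return a copy of the member list with one member marked up or down.
  return members.map(m => (m.name === name ? { ...m, up } : m));
}

function step(members) {
  const hasPrimary = members.some(m => m.up && m.state === "PRIMARY");
  let promoted = false;
  return members.map(m => {
    if (!m.up) return { ...m, state: "DOWN" };
    if (m.state === "DOWN") return { ...m, state: "RECOVERING" };
    if (m.state === "RECOVERING") return { ...m, state: "SECONDARY" };
    if (m.state === "SECONDARY" && !hasPrimary && !promoted) {
      promoted = true; // only one secondary is promoted per step
      return { ...m, state: "PRIMARY" };
    }
    return m;
  });
}

const states = members => members.map(m => m.state).join(",");

const initial = [
  { name: "node1", state: "PRIMARY", up: true },
  { name: "node2", state: "SECONDARY", up: true },
  { name: "node3", state: "SECONDARY", up: true },
];
const afterFailure = step(setUp(initial, "node1", false));
const afterRestart = step(setUp(afterFailure, "node1", true));
const afterRecovery = step(afterRestart);

console.log(states(afterFailure));  // DOWN,PRIMARY,SECONDARY
console.log(states(afterRestart));  // RECOVERING,PRIMARY,SECONDARY
console.log(states(afterRecovery)); // SECONDARY,PRIMARY,SECONDARY
```

The application keeps writing to whichever member is currently the primary; at no point in the sketch are there two primaries.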
  15. Scalability
  16. Horizontal Scalability
  17. Sharding: Data Distribution across nodes
      • Data location transparent to your code
      • Data distribution is automatic
      • Data re-distribution is automatic
      • Aggregate system resources horizontally
      • No code changes
  18. Sharding - Range distribution
      sh.shardCollection("test.tweets", {_id: 1}, false)
      [diagram: shard01, shard02, shard03]
  19. Sharding - Range distribution
      [diagram: shard01: a-i; shard02: j-r; shard03: s-z]
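
Range distribution as drawn on the slide above (a-i, j-r, s-z) can be sketched as a simple lookup. This is a simplification in plain JavaScript, assuming the shard is chosen from the first letter of `_id`; the `ranges` table and the `shardFor` helper are hypothetical names, and a real cluster compares the whole shard key value against chunk boundaries.

```javascript
// Range table copied from the slide; comparing only the first letter
// of the key is an assumption made to keep the sketch short.
const ranges = [
  { shard: "shard01", min: "a", max: "i" },
  { shard: "shard02", min: "j", max: "r" },
  { shard: "shard03", min: "s", max: "z" },
];

function shardFor(id) {
  const first = String(id)[0].toLowerCase();
  const hit = ranges.find(r => first >= r.min && first <= r.max);
  return hit ? hit.shard : null; // null: key falls outside every range
}

console.log(shardFor("alice"));    // shard01
console.log(shardFor("norberto")); // shard02
console.log(shardFor("steve"));    // shard03
```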
  20. Sharding - Splits
      [diagram: shard01: a-i; shard02: ja-jz, k-r; shard03: s-z]
  21. Sharding - Splits
      [diagram: shard01: a-i; shard02: ja-ji, ji-js, js-jw, jz-r; shard03: s-z]
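
A split turns one chunk into two smaller chunks that stay on the same shard. The sketch below, in plain JavaScript, splits at the median key once a chunk holds more than `maxDocs` keys. The `splitChunk` helper, the document-count threshold, and the sample keys are assumptions for illustration; an actual cluster decides when to split based on chunk size rather than a simple key count.

```javascript
// Split a chunk covering [min, max) at its median key.
// Assumption: a chunk is "too big" when it holds more than maxDocs keys.
function splitChunk(chunk, maxDocs) {
  if (chunk.keys.length <= maxDocs) return [chunk];
  const sorted = [...chunk.keys].sort();
  const mid = Math.floor(sorted.length / 2);
  const splitPoint = sorted[mid];
  return [
    { min: chunk.min, max: splitPoint, keys: sorted.slice(0, mid) },
    { min: splitPoint, max: chunk.max, keys: sorted.slice(mid) },
  ];
}

const jToR = {
  min: "j",
  max: "s",
  keys: ["rob", "jane", "mark", "joe", "nina", "kim"],
};
const parts = splitChunk(jToR, 4);
console.log(parts.map(p => `[${p.min}, ${p.max}): ${p.keys.join(" ")}`));
```

Together the two halves cover exactly the original range, so no key changes shard as a result of the split itself.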
  22. Sharding - Auto Balancing
      [diagram: shard01, shard02, shard03; chunks a-i, ja-ji, s-z, ji-js, js-jw (shown twice), jz-r (shown twice)]
  23. Sharding - Auto Balancing
      [diagram: shard01, shard02, shard03; chunks a-i, ja-ji, n-z, ji-js, js-jw, jz-r]
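
Auto balancing can be sketched as repeatedly moving a chunk from the shard with the most chunks to the shard with the fewest. The example below is plain JavaScript and starts from a layout with one chunk on shard01, four on shard02, and one on shard03. The `balance` helper and the threshold of 2 are assumptions for illustration; the real balancer applies its own migration thresholds and moves data in the background.

```javascript
// Move chunks until no shard has 2 or more chunks above the emptiest shard.
// Assumption: chunk count is the only balancing signal in this sketch.
const THRESHOLD = 2;

function balance(shards) {
  // Work on a copy so the caller's layout is left untouched.
  const state = Object.fromEntries(
    Object.entries(shards).map(([name, chunks]) => [name, [...chunks]])
  );
  const moves = [];
  while (true) {
    const names = Object.keys(state).sort(
      (a, b) => state[a].length - state[b].length
    );
    const least = names[0];
    const most = names[names.length - 1];
    if (state[most].length - state[least].length < THRESHOLD) break;
    const chunk = state[most].pop();
    state[least].push(chunk);
    moves.push({ chunk, from: most, to: least });
  }
  return { state, moves };
}

const result = balance({
  shard01: ["a-i"],
  shard02: ["ja-ji", "ji-js", "js-jw", "jz-r"],
  shard03: ["s-z"],
});
console.log(result.moves);
console.log(result.state);
```

With this input the sketch moves two chunks off shard02 and ends with two chunks on every shard.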
  24. Sharding - Routed Query
      find({_id: "norberto"})
      [diagram: shard01, shard02, shard03; chunks a-i, ja-ji, n-z, ji-js, js-jw, jz-r]
  25. Sharding - Routed Query
      find({_id: "norberto"})
      [diagram: shard01, shard02, shard03; chunks a-i, ja-ji, n-z, ji-js, js-jw, jz-r]
  26. Sharding - Scatter Gather
      find({email: "norberto@10gen.com"})
      [diagram: shard01, shard02, shard03; chunks a-i, ja-ji, n-z, ji-js, js-jw, jz-r]
  27. Sharding - Scatter Gather
      find({email: "norberto@10gen.com"})
      [diagram: shard01, shard02, shard03; chunks a-i, ja-ji, n-z, ji-js, js-jw, jz-r]
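
The difference between a routed query and a scatter-gather query comes down to whether the query contains the shard key. The plain JavaScript sketch below assumes the collection is sharded on `_id` with the simple a-i, j-r, s-z layout from the Range distribution slide; the `targetShards` helper and the first-letter comparison are hypothetical simplifications, not the router's actual logic.

```javascript
// Decide which shards a query must be sent to.
// Assumptions: shard key is _id; ranges are matched on the first letter.
const shardKey = "_id";
const shardRanges = [
  { shard: "shard01", min: "a", max: "i" },
  { shard: "shard02", min: "j", max: "r" },
  { shard: "shard03", min: "s", max: "z" },
];

function targetShards(query) {
  if (!(shardKey in query)) {
    // No shard key in the query: every shard has to be asked.
    return shardRanges.map(r => r.shard);
  }
  const first = String(query[shardKey])[0].toLowerCase();
  return shardRanges
    .filter(r => first >= r.min && first <= r.max)
    .map(r => r.shard);
}

console.log(targetShards({ _id: "norberto" }));             // one shard
console.log(targetShards({ email: "norberto@10gen.com" })); // all shards
```

The first query touches one shard; the second touches all three, and the results then have to be merged.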
  28. Sharding - Caching
      [diagram: shard01 holds a-i, j-r, n-z; 96 GB Mem; 300 GB Data; 3:1 Data/Mem]
  29. Aggregate Horizontal Resources
      [diagram: 300 GB Data across shard01 (a-i), shard02 (j-r), shard03 (n-z); each shard has 96 GB Mem and 100 GB of data; 1:1 Data/Mem per shard]
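
The ratios on the two slides above follow from dividing the data evenly across shards. A small worked example in plain JavaScript, where `dataToMemRatio` is a hypothetical helper and an even spread of data across shards is assumed:

```javascript
// Data-to-memory ratio per shard, assuming data is spread evenly.
function dataToMemRatio(dataGB, memPerShardGB, shardCount) {
  const perShardDataGB = dataGB / shardCount;
  return { perShardDataGB, ratio: perShardDataGB / memPerShardGB };
}

const oneShard = dataToMemRatio(300, 96, 1);
const threeShards = dataToMemRatio(300, 96, 3);

console.log(oneShard.ratio.toFixed(2));    // 3.13, roughly 3:1
console.log(threeShards.ratio.toFixed(2)); // 1.04, roughly 1:1
```

With three shards, each shard's 100 GB of data is close to its 96 GB of memory, so far more of the data can stay cached than when a single machine holds all 300 GB.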
  30. Consistency & Durability
  31. Two choices for consistency
      • Eventual consistency
        • Allow updates when a system has been partitioned
        • Resolve conflicts later
        • Example: CouchDB, Cassandra
      • Immediate consistency
        • Limit the application of updates to a single master node for a given slice of data
        • Another node can take over after a failure is detected
        • Avoids the possibility of conflicts
        • Example: MongoDB
  32. Durability
      • For how long is my data available?
      • When do I know that my data is safe?
      • Where?
      • MongoDB style
        • Fire and Forget
        • Get Last Error
        • Journal Sync
        • Replica Safe
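
The four durability levels on the slide above can be read as increasingly strict acknowledgements requested after a write. The plain JavaScript sketch below maps each level to the `getlasterror` command document that, as far as I know, a 2012-era client would send with `db.runCommand(...)`. The mode names and the `getLastErrorCommand` helper are hypothetical labels for this example, and `w: 2` (primary plus one secondary) is just one possible "replica safe" setting.

```javascript
// Map the slide's durability levels to getlasterror command documents.
// Assumption: mode names are invented here; only the command fields
// (getlasterror, j, w) are meant to mirror the server's options.
function getLastErrorCommand(mode) {
  switch (mode) {
    case "fireAndForget":
      return null; // send the write and do not ask for any acknowledgement
    case "getLastError":
      return { getlasterror: 1 }; // wait until the server has applied the write
    case "journalSync":
      return { getlasterror: 1, j: true }; // wait for the journal to be written
    case "replicaSafe":
      return { getlasterror: 1, w: 2 }; // wait for 2 replica set members
    default:
      throw new Error("unknown durability mode: " + mode);
  }
}

for (const mode of ["fireAndForget", "getLastError", "journalSync", "replicaSafe"]) {
  console.log(mode, JSON.stringify(getLastErrorCommand(mode)));
}
```

Each step down the list trades write latency for a stronger guarantee about where the data is when the call returns.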
  33. Data Durability
  34. Flexibility
  35. Data Model
      • Why JSON?
        • Provides a simple, well understood encapsulation of data
        • Maps simply to the object in your OO language
        • Linking & Embedding to describe relationships
  36. JSON
      place1 = {
        name : "10gen HQ",
        address : "578 Broadway 7th Floor",
        city : "New York",
        zip : "10011",
        tags : [ "business", "tech" ]
      }
  37. Schema Design: Relational Database
  38. Schema Design: MongoDB
      [diagram: embedding; linking]
  39. Schemas in MongoDB
      Design documents that simply map to your application
      post = {author: "Hergé",
              date: new Date(),
              text: "Destination Moon",
              tags: ["comic", "adventure"]}
      > db.posts.save(post)
  40. Embedding
      > db.blogs.find( { author: "Hergé"} )
      { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
        author : "Hergé",
        date : ISODate("2011-09-18T09:56:06.298Z"),
        text : "Destination Moon",
        tags : [ "comic", "adventure" ],
        comments : [
          {
            author : "Kyle",
            date : ISODate("2011-09-19T09:56:06.298Z"),
            text : "great book"
          }
        ]
      }
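
The slides name two ways to describe a relationship, embedding and linking, but only show embedding. The plain JavaScript sketch below contrasts the two using in-memory arrays in place of collections. The `posts` and `comments` arrays, the `post_id` field, and the `loadLinked` helper are assumptions for illustration, not part of the original deck.

```javascript
// Embedding: the comments live inside the post document.
const embeddedPost = {
  _id: 1,
  author: "Hergé",
  text: "Destination Moon",
  comments: [{ author: "Kyle", text: "great book" }],
};

// Linking: comments are separate documents that point back at the post.
const posts = [{ _id: 1, author: "Hergé", text: "Destination Moon" }];
const comments = [{ post_id: 1, author: "Kyle", text: "great book" }];

// With linking, the application needs a second lookup to assemble the post.
function loadLinked(postId) {
  const post = posts.find(p => p._id === postId);
  const linked = comments
    .filter(c => c.post_id === postId)
    .map(({ post_id, ...rest }) => rest); // drop the link field
  return { ...post, comments: linked };
}

console.log(JSON.stringify(loadLinked(1)) === JSON.stringify(embeddedPost));
```

Both shapes carry the same information; with embedding the whole post comes back from a single read of a single document.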
  41. JSON & Scaleout
      • Embedding removes need for
        • Distributed Joins
        • Two Phase commit
      • Enables data to be distributed across many nodes without penalty
  42. http://bit.ly/UmUnsU
  43. http://bit.ly/cnP77L
  44. http://bit.ly/ODoMhh
  45. http://bit.ly/uW2nk
  46. download at mongodb.org
      norberto@10gen.com
      Support, Training, Consulting, Events, Meetups: http://www.10gen.com
      Facebook: http://bit.ly/mongofb
      Twitter: http://twitter.com/mongodb
      LinkedIn: http://linkd.in/joinmongo
