Modeling Data in MongoDB


    1. 1. Modeling Data in MongoDB Luke Ehresman http://copperegg.com
    5. 5. Schema Design
        Wait, isn’t MongoDB schemaless?
        Nope! (just no predefined schema)
        That means it’s up to your application.
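One way to read "it's up to your application" is that the application checks a document's shape before writing it. The sketch below is plain JavaScript with no database driver. The `userSchema` object and the `validate` helper are hypothetical names made up for illustration, not part of MongoDB.

```javascript
// Hypothetical application-level schema: field name -> expected typeof result.
const userSchema = { _id: "string", first_name: "string", logins: "number" };

// Returns a list of problems; an empty list means the document conforms.
function validate(doc, schema) {
  const errors = [];
  for (const [field, type] of Object.entries(schema)) {
    if (!(field in doc)) {
      errors.push(`missing field: ${field}`);
    } else if (typeof doc[field] !== type) {
      errors.push(`wrong type for ${field}: expected ${type}`);
    }
  }
  return errors;
}

const good = validate({ _id: "lehresman", first_name: "Luke", logins: 1 }, userSchema);
const bad = validate({ _id: "lehresman", logins: "1" }, userSchema);
console.log(good); // []
console.log(bad);  // two problems: first_name is missing, logins is a string
```

An application would typically run a check like this just before an insert or update, since the database itself will accept either document.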
    12. 12. Schema Design (Relational)
        • Tabular data - Tables, Rows, Columns
        • Normalized - flatten your data
        • Columns with simple values (int, varchar)
        • Relate rows with foreign key references
        • Reuse, don’t repeat (e.g. a person)
        • Indexes on values
    19. 19. Schema Design (MongoDB - Non-Relational)
        • Databases > Collections > Documents
        • Simple or complex values (ints, strings, objects, arrays)
        • Documents are monolithic units
        • Embedded complex data structures
        • No joins - repeat data for faster access
        • Difficult to relate documents together
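To make the contrast between the two slides concrete, here is a small sketch in plain JavaScript of the same data held both ways. The names (`people`, `phones`, `personDoc`) and the phone numbers are invented for illustration.

```javascript
// Relational style: flat rows with simple values, related by a foreign key.
const people = [{ id: 1, name: "Luke" }];
const phones = [
  { person_id: 1, kind: "home", number: "555-0100" },
  { person_id: 1, kind: "work", number: "555-0101" },
];

// Document style: the same data as one monolithic unit with an embedded array.
const personDoc = {
  _id: 1,
  name: "Luke",
  phones: [
    { kind: "home", number: "555-0100" },
    { kind: "work", number: "555-0101" },
  ],
};

// Relational read: find the person row, then "join" by scanning the second table.
const joined = phones
  .filter((p) => p.person_id === people[0].id)
  .map((p) => p.number);

// Document read: everything arrives with the one document.
const embedded = personDoc.phones.map((p) => p.number);

console.log(joined, embedded);
```

Both reads return the same numbers. The document form gets them without a second lookup, at the cost of the phone entries being harder to query on their own.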
    25. 25. How will you use it?
        • The best way to use MongoDB is to tailor your schema to how it will be used
        • Things to consider:
            • minimize reads and/or writes
            • more writes, fewer reads? (suits a read-heavy workload)
            • more reads, fewer writes? (suits a write-heavy workload)
    30. 30. How will you use it?
        • Combine objects into one document if you will use them together.
        • Example: Authors and Books
        • Separate them if they need to be used separately -- but beware, no joins!
        • Or duplicate the data -- but beware, every copy must be kept up to date!
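The two warnings on this slide can be sketched in plain JavaScript, with Maps standing in for collections. The documents, the field names (`author_id`, `author_name`) and both helper functions are hypothetical. The first helper shows the application doing the "join" itself. The second shows what duplicated data costs when it changes.

```javascript
// Two hypothetical collections, modeled as Maps keyed by _id.
const authors = new Map([["a1", { _id: "a1", name: "Jane Doe" }]]);
const books = new Map([
  ["b1", { _id: "b1", title: "First Book", author_id: "a1", author_name: "Jane Doe" }],
  ["b2", { _id: "b2", title: "Second Book", author_id: "a1", author_name: "Jane Doe" }],
]);

// Separate documents: the application performs the join with a second lookup.
function bookWithAuthor(bookId) {
  const book = books.get(bookId);
  const author = authors.get(book.author_id) || null; // the author may be missing
  return { title: book.title, author: author ? author.name : null };
}

// Duplicated data: renaming an author means touching every copy of the name.
function renameAuthor(authorId, newName) {
  authors.get(authorId).name = newName;
  let touched = 0;
  for (const book of books.values()) {
    if (book.author_id === authorId) {
      book.author_name = newName;
      touched += 1;
    }
  }
  return touched;
}

const touched = renameAuthor("a1", "Jane Q. Doe");
console.log(touched, bookWithAuthor("b1"));
```

Here one rename touches two book documents in addition to the author document. That extra write work is the price of reading `author_name` without a second lookup.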
    37. 37. Precompute!
        • Philosophy: do work before reads occur
        • Disk space is cheap - compute time is not (it’s expensive because users wait)
        • Do joins on write, not on read
        • Do complex aggregation ahead of time
        • Optimize for specific use cases
        • Delayed data is not always bad in real life
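A minimal sketch of "do work before reads occur", in plain JavaScript with in-memory structures in place of collections. The names `events`, `totals`, `recordVisit`, `visitsPrecomputed` and `visitsOnRead` are all made up for illustration.

```javascript
const events = [];        // raw log of visits
const totals = new Map(); // precomputed answer: page -> visit count

// Write path: store the event and update the running total in the same step.
function recordVisit(page) {
  events.push({ page });
  totals.set(page, (totals.get(page) || 0) + 1);
}

// Read path with precomputation: a single lookup.
function visitsPrecomputed(page) {
  return totals.get(page) || 0;
}

// Read path without precomputation: scan every event while the user waits.
function visitsOnRead(page) {
  return events.filter((e) => e.page === page).length;
}

["/", "/profile", "/"].forEach((page) => recordVisit(page));
console.log(visitsPrecomputed("/"), visitsOnRead("/"));
```

Both read paths give the same count. The precomputed one stays a single lookup as the log grows, while the scan gets slower with every visit recorded.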
    42. 42. Aggregation
        • Application
        • MapReduce (BEWARE!)
        • Group
        • Aggregation framework (coming in 2.2)
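The first option on this slide, aggregating in the application, might look like the sketch below. It is plain JavaScript over an array that stands in for query results. The `visitsPerPage` helper and the sample documents are assumptions for illustration, and none of the server-side options are shown.

```javascript
// Hypothetical documents, as an application might receive them from a query.
const users = [
  { _id: "lehresman", page_visits: { "/": 78, "/profile": 33 } },
  { _id: "billybob", page_visits: { "/": 761 } },
];

// Fold every user's per-page counters into one total per page.
function visitsPerPage(docs) {
  const totals = {};
  for (const doc of docs) {
    for (const [page, count] of Object.entries(doc.page_visits || {})) {
      totals[page] = (totals[page] || 0) + count;
    }
  }
  return totals;
}

const totals = visitsPerPage(users);
console.log(totals); // { '/': 839, '/profile': 33 }
```

This approach has to read every document over the network first, which is one reason to compute such totals ahead of time.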
    46. 46. Atomicity
        • MongoDB does have atomic transactions
        • Scope is a single document
        • Keep this in mind when designing schemas
    51. 51. Atomicity
        • $inc
        • $push
        • $addToSet
        • upsert (create-if-none-else-update)
    52. 52. Atomicity
        • Upsert example
            db.stats.update({_id: 'lehresman'},
                            {$inc: {logins: 1}, $set: {last_login: new Date()}},
                            {upsert: true});
        • After the first call: {_id: 'lehresman', logins: 1, last_login: A}
        • After the second call: {_id: 'lehresman', logins: 2, last_login: B}
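The create-if-none-else-update behaviour of the upsert above can be imitated in plain JavaScript. This is only a rough sketch: the `upsert` helper is hypothetical, matches on `_id` alone, and understands nothing but `$inc` and `$set`. It says nothing about atomicity, which is the server's job. The strings "A" and "B" stand in for the two login timestamps.

```javascript
const stats = new Map(); // stands in for the "stats" collection, keyed by _id

// Simplified stand-in for an upsert: create the document if it is missing,
// then apply the $inc and $set parts of the update to it.
function upsert(collection, id, update) {
  let doc = collection.get(id);
  if (!doc) {
    doc = { _id: id };
    collection.set(id, doc);
  }
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    doc[field] = (doc[field] || 0) + amount; // a missing field starts from 0
  }
  Object.assign(doc, update.$set || {});
  return doc;
}

upsert(stats, "lehresman", { $inc: { logins: 1 }, $set: { last_login: "A" } });
const afterFirst = { ...stats.get("lehresman") };
upsert(stats, "lehresman", { $inc: { logins: 1 }, $set: { last_login: "B" } });
const afterSecond = { ...stats.get("lehresman") };
console.log(afterFirst, afterSecond);
```

The first call creates the document with `logins: 1`. The second finds it and increments it, so the collection still holds a single document.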
    54. 54. Example: Books (Bad NoSQL Example!!)
        • Many books
        • Many authors
        • Authors write many books
    55. 55. Example: User Stats
        • You have users
        • Track what pages they visit
    56. 56. Example: User Stats
        “users” collection
            {
              _id: 'lehresman',
              first_name: 'Luke',
              last_name: 'Ehresman',
              page_visits: {
                '/': 78,
                '/profile': 33,
                '/blog/38919': 2
              }
            }
        Problem: What if you want aggregate stats across users?
    58. 58. Example: User Stats
        “visits” collection
            { _id: '/', visits: 73889 }
            { _id: '/profile', visits: 9341 }
            { _id: '/blog/38919', visits: 1678 }
        Problems: No user tracking; What if you want aggregate stats by day?
    60. 60. Example: User Stats
        “visits” collection
            {
              _id: '/',
              visits: 73889,
              days: {    // key name assumed; the slide leaves it out
                '2012-06-01': 839,
                '2012-06-02': 767,
                '2012-06-03': 881
              }
            }
        Problems: No user tracking; Possibly too large eventually. Always grows.
    62. 62. Example: User Stats
        “visits” collection
            {
              date: '2012-06-01',
              page: '/',
              visits: 839,
              users: {
                'lehresman': 78,
                'billybob': 761
              }
            }
        No relational integrity. (up to your application to handle null cases)
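The final schema, one document per date and page, can be sketched in plain JavaScript with a Map in place of the "visits" collection. The helpers `recordVisit` and `visitorNames`, the `knownUsers` map and the "(unknown user)" placeholder are all assumptions for illustration. In the shell, the write would presumably be a single upsert using `$inc` on `visits` and on a dotted `users.<name>` path.

```javascript
const visits = new Map(); // key: "date|page" -> document shaped like the slide

// Write path: find or create the day/page document, then bump both counters.
function recordVisit(date, page, user) {
  const key = `${date}|${page}`;
  let doc = visits.get(key);
  if (!doc) {
    doc = { date, page, visits: 0, users: {} };
    visits.set(key, doc);
  }
  doc.visits += 1;
  doc.users[user] = (doc.users[user] || 0) + 1;
}

// No relational integrity: a name under "users" may not match any user document,
// so the application has to handle the missing case itself.
const knownUsers = new Map([["lehresman", { first_name: "Luke" }]]);
function visitorNames(date, page) {
  const doc = visits.get(`${date}|${page}`);
  if (!doc) return [];
  return Object.keys(doc.users).map((id) => {
    const user = knownUsers.get(id);
    return user ? user.first_name : "(unknown user)";
  });
}

recordVisit("2012-06-01", "/", "lehresman");
recordVisit("2012-06-01", "/", "lehresman");
recordVisit("2012-06-01", "/", "billybob");
const names = visitorNames("2012-06-01", "/");
console.log(visits.get("2012-06-01|/"), names);
```

Because each document covers a single day and page, it stops growing once the day ends. The lookup for `billybob` shows the null case the slide warns about.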
