Advanced Schema Design Patterns

Speaker: Daniel Coupal
At this point, you may be familiar with the design of MongoDB databases and collections – but what are the frequent patterns you may have to model?
This presentation shows how to represent common relationships (1-1, 1-N, N-N) in MongoDB. Going beyond relationships, it identifies a set of common patterns, much as the Gang of Four did for object-oriented design. Finally, it guides you through the steps of modeling those patterns in MongoDB collections.
In this session, you will learn about:
How to create the appropriate MongoDB collections for some of the patterns discussed.
Differences in relationships vs. the relational database world, and how those differences translate to MongoDB collections.
Common patterns in developing applications with MongoDB, plus a specific vocabulary with which to refer to them.

  1. Advanced Schema Design Patterns
     October 16, 2017 | MongoDB Webinar
  2. Who Am I? #MDBlocal
     {
       "name": "Daniel Coupal",
       "jobs_at_MongoDB": [
         { "job": "Senior Curriculum Engineer", "from": new Date("2016-11") },
         { "job": "Senior Technical Service Engineer", "from": new Date("2013-11") }
       ],
       "previous_jobs": [ "Consultant", "Developer", "Manager Quality & Tools Team", "Manager Software Team", "Tools Developer" ],
       "likes": [ "food", "beers", "movies", "MongoDB" ]
     }
  3. Pattern
     The "Gang of Four": A design pattern systematically names, explains, and evaluates an important and recurring design in object-oriented systems.
     MongoDB systems can also be built using its own patterns.
  4. Why this Talk?
     • Enable teams to use a common methodology and vocabulary when designing schemas for MongoDB
     • Giving you the ability to model schemas using building blocks
     • Less art and more methodology
  5. Why do we Create Models?
     Ensure:
     • Good performance
     • Scalability
     despite constraints ➡
     • Hardware
       • RAM faster than Disk
       • Disk cheaper than RAM
       • Network latency
       • Reduce costs $$$
     • Database Server
       • Maximum size for a document
       • Atomicity of a write
     • Data set
       • Size of data
  6. However …
     • Don’t over-design!
     • Design for:
       • Performance
       • Scalability
       • Simplicity
  7. WMDB - World Movie Database
     Any events, characters and entities depicted in this presentation are fictional. Any resemblance or similarity to reality is entirely coincidental.
  8. WMDB - World Movie Database
     First iteration
     3 collections:
     A. movies
     B. moviegoers
     C. screenings
  9. Mission Possible
     Our mission, should we decide to accept it, is to fix this solution, so it can perform well and scale. As always, should I or anyone in the audience do it without training, WMDB will disavow any knowledge of our actions. This tape will self-destruct in five seconds. Good luck!
  10. Categories of Patterns
      • Frequency of Access
        • Subset ✓
        • Approximation ✓
      • Grouping
        • Computed ✓
        • Overflow
        • Bucket
      • Representation
        • Attribute ✓
        • Schema Versioning ✓
        • Document Versioning
        • Tree
        • Pre-Allocation
  11. Issue #1: Big Documents, Many Fields and Many Indexes
      {
        title: "Moonlight",
        ...
        release_USA: "2016/09/02",
        release_Mexico: "2017/01/27",
        release_France: "2017/02/01",
        release_Festival_Mill_Valley: "2017/10/10"
      }
      Would need the following indexes:
      { release_USA: 1 }
      { release_Mexico: 1 }
      { release_France: 1 }
      ...
      { release_Festival_Mill_Valley: 1 }
      ...
  12. Pattern #1: Attribute
      {
        title: "Moonlight",
        ...
        release_USA: "2016/09/02",
        release_Mexico: "2017/01/27",
        release_France: "2017/02/01",
        release_Festival_Mill_Valley: "2017/10/10"
      }
  13. Attribute Pattern
      Problem:
      • Lots of similar fields
      • A common characteristic makes it useful to search across those fields together
      • Fields present in only a small subset of documents
      Use cases:
      • Product attributes like ‘color’, ‘size’, ‘dimensions’, ...
      • Release dates of a movie in different countries, festivals
  14. Attribute Pattern - Solution
      Solution:
      • Field pairs in an array
      Benefits:
      • Allows for a non-deterministic list of attributes
      • Easy to index: { "releases.location": 1, "releases.date": 1 }
      • Easy to extend with a qualifier, for example: { descriptor: "price", qualifier: "euros", value: Decimal(100.00) }
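The reshaping that the Attribute pattern describes can be sketched in plain JavaScript. This is only an illustration: the helper name `toAttributePattern` is hypothetical, and the `releases`, `location` and `date` names are taken from the index shown on slide 14.

```javascript
// Hypothetical helper: move every release_* field into an array of
// { location, date } pairs, leaving all other fields untouched.
function toAttributePattern(movie) {
  const prefix = "release_";
  const reshaped = {};
  const releases = [];
  for (const [key, value] of Object.entries(movie)) {
    if (key.startsWith(prefix)) {
      releases.push({ location: key.slice(prefix.length), date: value });
    } else {
      reshaped[key] = value;
    }
  }
  reshaped.releases = releases;
  return reshaped;
}

const movie = {
  title: "Moonlight",
  release_USA: "2016/09/02",
  release_Mexico: "2017/01/27",
  release_France: "2017/02/01",
  release_Festival_Mill_Valley: "2017/10/10",
};

const reshaped = toAttributePattern(movie);
console.log(JSON.stringify(reshaped, null, 2));
```

With documents in this shape, one compound index on `releases.location` and `releases.date` would stand in for the per-country indexes listed under Issue #1.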
  15. Issue #2: Working Set doesn’t fit in RAM
      Possible solutions:
      A. Reduce the size of your working set
      B. Add more RAM per machine
      C. Start sharding or add more shards
  16. WMDB - World Movie Database
      First iteration
      3 collections:
      A. movies
      B. moviegoers
      C. screenings
  17. Pattern #2: Subset
      In this example, we can:
      • Limit the list of actors and crew to 20
      • Limit the embedded reviews to the top 20
      • …
  18. Subset Pattern
      Problem:
      • There is a 1-N or N-N relationship, and only a few documents always need to be shown
      • Only infrequently do you need to pull all of the dependent documents
      Use cases:
      • Main actors of a movie
      • List of reviews or comments
  19. Subset Pattern - Solution
      Solution:
      • Keep duplicates of a small subset of fields in the main collection
      Benefits:
      • Allows for fast data retrieval and a reduced working set size
      • One query brings all the information needed for the "main page"
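A minimal sketch of the Subset pattern, assuming the full cast and review lists live elsewhere and only a copy of the first 20 entries is embedded in the movie document. The helper `embedSubset` and the field names `cast`, `top_reviews`, `billing` and `rating` are made up for illustration.

```javascript
// Hypothetical helper: embed only a small subset of the related
// documents in the main movie document. The full lists are left
// untouched and would stay in their own collections.
function embedSubset(movie, cast, reviews, limit = 20) {
  // Assumption: "top" reviews are those with the highest rating.
  const topReviews = [...reviews]
    .sort((a, b) => b.rating - a.rating)
    .slice(0, limit);
  return {
    ...movie,
    cast: cast.slice(0, limit), // assumes cast is already in billing order
    top_reviews: topReviews,
  };
}

// Made-up sample data: 25 cast members and 30 reviews.
const cast = Array.from({ length: 25 }, (_, i) => ({
  name: `Actor ${i + 1}`,
  billing: i + 1,
}));
const reviews = Array.from({ length: 30 }, (_, i) => ({
  text: `Review ${i + 1}`,
  rating: i % 5,
}));

const mainDoc = embedSubset({ _id: 1, title: "Moonlight" }, cast, reviews);
console.log(mainDoc.cast.length, mainDoc.top_reviews.length); // 20 20
```

The embedded copies are duplicates, so the application has to decide how and when they are refreshed when the source lists change.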
  20. Issue #3: Lots of CPU Usage
  21. Issue #3: ... caused by repeated calculations
      {
        title: "Your Name",
        ...
        viewings: 5000,
        viewers: 385000,
        revenues: 5074800
      }
  22. Pattern #3: Computed
      For example:
      • Apply a sum, count, ...
      • Roll up data by minute, hour, day
      • As long as you don’t mess with your source, you can recreate the rollups
  23. Computed Pattern
      Problem:
      • There is data that needs to be computed
      • The same calculations would happen over and over
      • Reads outnumber writes:
        • example: 1K writes per hour vs 1M reads per hour
      Use cases:
      • Have revenues per movie showing, want to display sums
      • Time series data, Event Sourcing
  24. Computed Pattern - Solution
      Solution:
      • Apply a computation or operation on data and store the result
      Benefits:
      • Avoid re-computing the same thing over and over
      • Replaces a view
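The Computed pattern can be sketched as a roll-up that runs when screenings are written, with the result stored on the movie document so reads do not repeat the calculation. The helper `computeTotals`, the per-screening field names `viewers` and `revenue`, and the sample numbers are assumptions for illustration.

```javascript
// Hypothetical helper: roll up per-screening figures into the totals
// that are stored on the movie document.
function computeTotals(screenings) {
  return screenings.reduce(
    (totals, screening) => ({
      viewings: totals.viewings + 1,
      viewers: totals.viewers + screening.viewers,
      revenues: totals.revenues + screening.revenue,
    }),
    { viewings: 0, viewers: 0, revenues: 0 }
  );
}

// Made-up sample data.
const screenings = [
  { movie: "Your Name", viewers: 120, revenue: 1500 },
  { movie: "Your Name", viewers: 80, revenue: 1000 },
  { movie: "Your Name", viewers: 100, revenue: 1250 },
];

// Store the computed values next to the movie data.
const movieDoc = { title: "Your Name", ...computeTotals(screenings) };
console.log(movieDoc); // viewings: 3, viewers: 300, revenues: 3750
```

Because the source screenings are kept, the stored totals can be recreated at any time.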
  25. Issue #4: Lots of Writes
      (chart labels) Web page counters, Updates on movie data, Screenings, Other
  26. Issue #4: … for non critical data
  27. Pattern #4: Approximation
      • Only increment once in X iterations
      • Increment by X
  28. (chart labels) Web page counters, Updates on movie data, Screenings, Other
  29. Approximation Pattern
      Problem:
      • Data is difficult to calculate correctly
      • May be too expensive to update the document every time to keep an exact count
      • No one gives a damn if the number is exact
      Use cases:
      • Population of a country
      • Web site visits
  30. Approximation Pattern – Solution
      Solution:
      • Fewer, stronger writes
      Benefits:
      • Fewer writes, reducing contention on some documents
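One way to read "only increment once in X iterations, increment by X" is a probabilistic counter: each hit has a 1-in-X chance of triggering a write that adds X. The sketch below assumes that reading; `makeApproximateCounter` is a hypothetical name, and the in-memory `stored` variable stands in for the counter field of a document.

```javascript
// Hypothetical approximate counter: on average one write per `step`
// hits, and each write adds `step` to the stored value.
function makeApproximateCounter(step, random = Math.random) {
  let stored = 0; // stands in for the counter field in the document
  let writes = 0; // number of writes that would reach the database
  return {
    hit() {
      if (random() < 1 / step) {
        stored += step;
        writes += 1;
      }
    },
    get value() {
      return stored;
    },
    get writes() {
      return writes;
    },
  };
}

// Repeatable stand-in for Math.random: cycles 0.0, 0.1, ..., 0.9.
let tick = 0;
const fakeRandom = () => (tick++ % 10) / 10;

const pageViews = makeApproximateCounter(10, fakeRandom);
for (let i = 0; i < 1000; i++) {
  pageViews.hit();
}
console.log(pageViews.value, pageViews.writes); // 1000 100
```

With the default `Math.random` the stored value is only close to the true count, which is the trade-off the pattern accepts in exchange for roughly one tenth of the writes.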
  31. Issue #5: Need to change the list of fields in the documents
      • Keeping track of the schema version of a document
  32. Pattern #5: Schema Versioning
      Add a field to track the schema version number, per document
      Does not have to exist for version 1
  33. Schema Versioning Pattern
      Problem:
      • Updating the schema of a database is:
        • Not atomic
        • Long operation
      • May not want to update all documents, only do it on updates
      Use cases:
      • Practically any database that will go to production
  34. Schema Versioning Pattern – Solution
      Solution:
      • Have a field keeping track of the schema version
      Benefits:
      • Don't need to update all the documents at once
      • May not have to update documents until their next modification
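A sketch of how an application might use the version field, assuming a hypothetical field name `schema_version` and a made-up change between versions: version 1 documents carry `release_*` fields, version 2 documents carry a `releases` array. Documents without the field are treated as version 1, as the slides suggest.

```javascript
const CURRENT_SCHEMA_VERSION = 2;

// Hypothetical upgrade step, applied lazily when a document is read
// or about to be modified. Documents already at the current version
// are returned unchanged.
function upgradeToCurrent(doc) {
  const version = doc.schema_version ?? 1; // a missing field means version 1
  if (version >= CURRENT_SCHEMA_VERSION) {
    return doc;
  }
  const prefix = "release_";
  const upgraded = {};
  const releases = [];
  for (const [key, value] of Object.entries(doc)) {
    if (key.startsWith(prefix)) {
      releases.push({ location: key.slice(prefix.length), date: value });
    } else {
      upgraded[key] = value;
    }
  }
  upgraded.releases = releases;
  upgraded.schema_version = CURRENT_SCHEMA_VERSION;
  return upgraded;
}

const oldDoc = {
  title: "Moonlight",
  release_USA: "2016/09/02",
  release_France: "2017/02/01",
};
const newDoc = upgradeToCurrent(oldDoc);
console.log(newDoc.schema_version, newDoc.releases.length); // 2 2
```

Because the upgrade runs per document, old and new shapes can coexist in the same collection until each document is next modified.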
  35. Aspect of Patterns: Consistency
      • How duplication is handled
        A. Update both source and target in real time
        B. Update target from source at regular intervals. Examples:
           • Most popular items => update nightly
           • Revenues from a movie => update every hour
           • Last 10 reviews => update hourly? daily?
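Option B can be sketched as a staleness check that a scheduled job might run before refreshing a duplicated or computed field. The policy table, the field names `revenues` and `most_popular`, and the helper `needsRefresh` are illustrative assumptions; the intervals follow the examples on the slide.

```javascript
const HOUR_MS = 60 * 60 * 1000;

// Hypothetical refresh policy: how old a derived value may get
// before it is rebuilt from its source.
const refreshPolicy = {
  revenues: HOUR_MS, // update every hour
  most_popular: 24 * HOUR_MS, // update nightly
};

// Returns true when the derived field is older than its policy allows.
function needsRefresh(field, lastUpdated, now = new Date()) {
  return now.getTime() - lastUpdated.getTime() >= refreshPolicy[field];
}

const lastUpdated = new Date("2017-10-16T10:00:00Z");
const now = new Date("2017-10-16T11:30:00Z");
console.log(needsRefresh("revenues", lastUpdated, now)); // true
console.log(needsRefresh("most_popular", lastUpdated, now)); // false
```

How stale each value may become is a business decision, which is why the policy is kept as data rather than hard-coded in the check.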
  36. Other Patterns
      • Bucket
        • grouping documents together, to have fewer documents
      • Document Versioning
        • tracking of content changes in a document
      • Outlier
        • Avoid letting a few documents drive the design and impact performance for all
      • Tree(s)
      • Pre-allocation
  37. #MDBW17 BACK to reality
  38. Take Aways
      • Simple grouping from tables to collections is not optimal
      • Learn a common vocabulary for designing schemas with MongoDB
      • Use patterns as "plug-and-play" for your future designs
        • Attribute
        • Subset
        • Computed
        • Approximation
        • Schema Versioning
  39. References for complete Solutions
      A full design example for a given problem:
      • E-commerce site
      • Content Management System
      • Social Networking
      • Single view
      • …
  40. How Can I Learn More About Schema Design?
      • More patterns in a follow-up to this presentation
      • MongoDB in-person training courses on Schema Design
      • Upcoming online course at MongoDB University:
        • https://university.mongodb.com
        • M220 Data Modeling
  41. Thank You for using MongoDB!
      daniel.coupal@mongodb.com
