MongoDB Schema Design

An overview of MongoDB Schema Design from M

Published in: Business, Technology

  • Flexibility: the ability to represent rich data structures. Performance: the benefit of data locality.
  • Concrete example of a typical blog using a document-oriented, de-normalized approach
  • Tools for data access
  • Tools for data manipulation
  • Slow to get address data every time you query for a user. Requires an extra operation.
  • A patron may have two addresses. In a relational database, that would require a separate table. With MongoDB, you simply start storing the address field as an array. Only patrons with multiple addresses need this schema, so no migration is necessary. Caution: additional application logic is required!
  • Publisher is repeated for every book, data duplication!
  • Publisher is better being a separate entity and having its own collection.
  • Now to create a relation between the two entities, you can choose to reference the publisher from the book document. This is similar to the relational approach for this very same problem.
  • Or, because MongoDB documents can contain arrays, you can model the relation by creating and maintaining an array of books within each publisher entity. Be careful with mutable, growing arrays. See next slide.
  • Costly for a small number of books because to get the publisher
  • And data locality provides speed
  • tie back to examples, give some concrete scenarios
  • Authors often use pseudonyms for a book even though it’s the same individual. To get books by a particular author: (1) get the author; (2) get the books that have that author id in their authors array.
  • To get the authors of a given book: a single query. To get books by a particular author: (1) get the author id; (2) get the books that have that author id in their authors array.
  • Getting the titles of the books by an author is a single query. Getting the authors of a book takes two queries: (1) get the book id; (2) query the authors whose books array contains that id.
  • Transcript

    • 1. Schema Design. Emily Stolfo (@EmStolfo), Ruby Engineer/Evangelist, 10gen. #mongodbdays
    • 2. Agenda
        • Working with documents
        • Common patterns
        • Queries and Indexes
    • 3. Terminology
        RDBMS        ➜  MongoDB
        Database     ➜  Database
        Table        ➜  Collection
        Row          ➜  Document
        Index        ➜  Index
        Join         ➜  Embedded Document
        Foreign Key  ➜  Reference
    • 4. Working with Documents
    • 5. Documents provide flexibility and performance
    • 6. Example Schema (MongoDB)
    • 7. Example Schema (MongoDB): Embedding
    • 8. Example Schema (MongoDB): Embedding and Linking
    • 9. Relational Schema Design: focuses on data storage
    • 10. Document Schema Design: focuses on data use
    • 11. Schema Design Considerations
        • What is a priority?
            – High consistency
            – High read performance
            – High write performance
        • How does the application access and manipulate data?
            – Read/Write Ratio
            – Types of Queries / Updates
            – Data life-cycle and growth
            – Analytics (Map Reduce, Aggregation)
    • 12. Tools for Data Access
        • Flexible Schemas
        • Embedded data structures
        • Secondary Indexes
        • Multi-Key Indexes
        • Aggregation Framework
            – Pipeline operators: $project, $match, $limit, $skip, $sort, $group, $unwind
        • No Joins
    • 13. Data Manipulation
        • Conditional Query Operators
            – Scalar: $ne, $mod, $exists, $type, $lt, $lte, $gt, $gte
            – Vector: $in, $nin, $all, $size
        • Atomic Update Operators
            – Scalar: $inc, $set, $unset
            – Vector: $push, $pop, $pull, $pushAll, $pullAll, $addToSet
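The difference between two of the update operators listed above, $push and $addToSet, can be sketched in plain JavaScript. This is a simplified in-memory stand-in, not MongoDB's implementation: `applyUpdate` is a hypothetical helper that only handles top-level fields, and the `copies` and `tags` fields are assumptions that do not appear in the deck's book model.

```javascript
// Simplified in-memory stand-in for four update operators.
// Assumption: top-level fields only; this is not how the server implements them.
function applyUpdate(doc, update) {
  const result = JSON.parse(JSON.stringify(doc)); // deep copy; the input stays untouched
  for (const [field, value] of Object.entries(update.$set || {})) {
    result[field] = value;
  }
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    result[field] = (result[field] || 0) + amount;
  }
  for (const [field, value] of Object.entries(update.$push || {})) {
    result[field] = [...(result[field] || []), value]; // always appends
  }
  for (const [field, value] of Object.entries(update.$addToSet || {})) {
    const current = result[field] || [];
    // appends only when the value is absent (primitive values only in this sketch)
    result[field] = current.includes(value) ? current : [...current, value];
  }
  return result;
}

// `copies` and `tags` are hypothetical fields used only for illustration.
const shelfBook = {
  _id: "123456789",
  title: "MongoDB: The Definitive Guide",
  copies: 2,
  tags: ["mongodb"],
};

// Shell form of the same update document:
//   db.books.updateOne({ _id: "123456789" }, { $inc: { copies: 1 }, $addToSet: { tags: "mongodb" } })
const afterAddToSet = applyUpdate(shelfBook, { $inc: { copies: 1 }, $addToSet: { tags: "mongodb" } });
const afterPush = applyUpdate(shelfBook, { $push: { tags: "mongodb" } });

console.log(afterAddToSet.copies, afterAddToSet.tags.length); // copies incremented, tag not duplicated
console.log(afterPush.tags.length);                           // tag appended a second time
```

On the server each of these operators is applied atomically to a single document, which is what makes them useful for embedded arrays.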
    • 14. Schema Design Example
    • 15. Library Management Application
        • Patrons
        • Books
        • Authors
        • Publishers
    • 16. One to One Relations: example
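    • 17. Modeling Patrons
        // Option 1: the address lives in its own document, linked by patron_id
        patron = {
          _id: "joe",
          name: "Joe Bookreader"
        }
        address = {
          patron_id: "joe",
          street: "123 Fake St.",
          city: "Faketon",
          state: "MA",
          zip: 12345
        }

        // Option 2: the address is embedded in the patron document
        patron = {
          _id: "joe",
          name: "Joe Bookreader",
          address: {
            street: "123 Fake St.",
            city: "Faketon",
            state: "MA",
            zip: 12345
          }
        }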
    • 18. One to One Relations
        • “Contains” relationships are often embedded.
        • Document provides a holistic representation of objects with embedded entities.
        • Optimized read performance.
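The read-performance point for one-to-one relations can be sketched by counting lookups. Plain arrays stand in for the patrons and addresses collections here, and the function names are hypothetical; against a real server, each `find` below would correspond to a separate query.

```javascript
// Referenced layout: address in its own collection, linked by patron_id.
const patronsRef = [{ _id: "joe", name: "Joe Bookreader" }];
const addressesRef = [
  { patron_id: "joe", street: "123 Fake St.", city: "Faketon", state: "MA", zip: 12345 },
];

// Embedded layout: address stored inside the patron document.
const patronsEmbedded = [
  {
    _id: "joe",
    name: "Joe Bookreader",
    address: { street: "123 Fake St.", city: "Faketon", state: "MA", zip: 12345 },
  },
];

// Two lookups: first the patron, then the address that points back at it.
function cityReferenced(patronId) {
  const patron = patronsRef.find((p) => p._id === patronId);
  const address = addressesRef.find((a) => a.patron_id === patron._id);
  return address.city;
}

// One lookup: the address arrives together with the patron.
function cityEmbedded(patronId) {
  return patronsEmbedded.find((p) => p._id === patronId).address.city;
}

console.log(cityReferenced("joe"), cityEmbedded("joe"));
```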
    • 19. One To Many Relations: examples
    • 20. Patrons with many addresses
        patron = {
          _id: "joe",
          name: "Joe Bookreader",
          join_date: ISODate("2011-10-15"),
          addresses: [
            {street: "1 Vernon St.", city: "Newton", state: "MA", …},
            {street: "52 Main St.", city: "Boston", state: "MA", …},
          ]
        }
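Because patrons with one address and patrons with several can coexist without a migration, the application has to accept both shapes. A minimal sketch of that application-level logic follows; `addressesOf` is a hypothetical helper, and the sample documents are trimmed versions of the ones on the slides.

```javascript
// Returns a patron's addresses as an array, whichever shape the document uses:
// a single embedded `address`, an array under `addresses`, or neither.
function addressesOf(patron) {
  const value = patron.addresses ?? patron.address;
  if (value == null) return [];
  return Array.isArray(value) ? value : [value];
}

const singleAddressPatron = {
  _id: "joe",
  name: "Joe Bookreader",
  address: { street: "123 Fake St.", city: "Faketon", state: "MA" },
};

const multiAddressPatron = {
  _id: "joe",
  name: "Joe Bookreader",
  addresses: [
    { street: "1 Vernon St.", city: "Newton", state: "MA" },
    { street: "52 Main St.", city: "Boston", state: "MA" },
  ],
};

console.log(addressesOf(singleAddressPatron).map((a) => a.city));
console.log(addressesOf(multiAddressPatron).map((a) => a.city));
```

Keeping this normalization in one place means the rest of the application can treat every patron as having a list of addresses.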
    • 21. One to Many Relations, example 2: Publishers and Books
    • 22. Publishers and Books relation
        • Publishers put out many books
        • Books have one publisher
    • 23. Book Data
        MongoDB: The Definitive Guide,
        By Kristina Chodorow and Mike Dirolf
        Published: 9/24/2010
        Pages: 216
        Language: English
        Publisher: O’Reilly Media, CA
    • 24. Book Model with Embedded Publisher
        book = {
          title: "MongoDB: The Definitive Guide",
          authors: [ "Kristina Chodorow", "Mike Dirolf" ],
          published_date: ISODate("2010-09-24"),
          pages: 216,
          language: "English",
          publisher: {
            name: "O’Reilly Media",
            founded: "1980",
            location: "CA"
          }
        }
    • 25. Book Model with Normalized Publisher
        publisher = {
          name: "O’Reilly Media",
          founded: "1980",
          location: "CA"
        }
        book = {
          title: "MongoDB: The Definitive Guide",
          authors: [ "Kristina Chodorow", "Mike Dirolf" ],
          published_date: ISODate("2010-09-24"),
          pages: 216,
          language: "English"
        }
    • 26. Link with Publisher _id as a Reference
        publisher = {
          _id: "oreilly",
          name: "O’Reilly Media",
          founded: "1980",
          location: "CA"
        }
        book = {
          title: "MongoDB: The Definitive Guide",
          authors: [ "Kristina Chodorow", "Mike Dirolf" ],
          published_date: ISODate("2010-09-24"),
          pages: 216,
          language: "English",
          publisher_id: "oreilly"
        }
    • 27. Link with Book _ids as a Reference
        publisher = {
          name: "O’Reilly Media",
          founded: "1980",
          location: "CA",
          books: [ "123456789", ... ]
        }
        book = {
          _id: "123456789",
          title: "MongoDB: The Definitive Guide",
          authors: [ "Kristina Chodorow", "Mike Dirolf" ],
          published_date: ISODate("2010-09-24"),
          pages: 216,
          language: "English"
        }
    • 28. Where do you put the reference?
        • Reference to single publisher on books
            – Use when items have unbounded growth (unlimited # of books)
        • Array of books in publisher document
            – Optimal when many means a handful of items
            – Use when there is a bound on potential growth
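The two placements of the reference lead to different read paths when listing a publisher's books. The sketch below uses plain arrays in place of collections; the function names are hypothetical, and the `_id: "oreilly"` on the second publisher document is carried over from slide 26 as an assumption.

```javascript
// Placement A: each book carries a publisher_id (suits unbounded growth).
const booksWithPublisherId = [
  { _id: "123456789", title: "MongoDB: The Definitive Guide", publisher_id: "oreilly" },
];

// One read. Shell form: db.books.find({ publisher_id: "oreilly" })
function titlesViaBookReference(publisherId) {
  return booksWithPublisherId
    .filter((b) => b.publisher_id === publisherId)
    .map((b) => b.title);
}

// Placement B: the publisher keeps an array of book _ids (suits a bounded handful).
const publishersWithBooks = [
  { _id: "oreilly", name: "O'Reilly Media", founded: "1980", location: "CA", books: ["123456789"] },
];
const plainBooks = [{ _id: "123456789", title: "MongoDB: The Definitive Guide" }];

// Two reads: fetch the publisher, then fetch the books it lists.
// Shell form of the second read: db.books.find({ _id: { $in: publisher.books } })
function titlesViaPublisherArray(publisherId) {
  const publisher = publishersWithBooks.find((p) => p._id === publisherId);
  if (!publisher) return [];
  return plainBooks.filter((b) => publisher.books.includes(b._id)).map((b) => b.title);
}

console.log(titlesViaBookReference("oreilly"));
console.log(titlesViaPublisherArray("oreilly"));
```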
    • 29. One to Many Relations, example 3: Books and Patrons
    • 30. Books and Patrons
        • Book can be checked out by one Patron at a time
        • Patrons can check out many books (but not 1000s)
    • 31. Modeling Checkouts
        patron = {
          _id: "joe",
          name: "Joe Bookreader",
          join_date: ISODate("2011-10-15"),
          address: { ... }
        }
        book = {
          _id: "123456789",
          title: "MongoDB: The Definitive Guide",
          authors: [ "Kristina Chodorow", "Mike Dirolf" ],
          ...
        }
    • 32. Modeling Checkouts
        patron = {
          _id: "joe",
          name: "Joe Bookreader",
          join_date: ISODate("2011-10-15"),
          address: { ... },
          checked_out: [
            { _id: "123456789", checked_out: "2012-10-15" },
            { _id: "987654321", checked_out: "2012-09-12" },
            ...
          ]
        }
    • 33. De-normalization provides data locality
    • 34. Modeling Checkouts - de-normalized
        patron = {
          _id: "joe",
          name: "Joe Bookreader",
          join_date: ISODate("2011-10-15"),
          address: { ... },
          checked_out: [
            { _id: "123456789",
              title: "MongoDB: The Definitive Guide",
              authors: [ "Kristina Chodorow", "Mike Dirolf" ],
              checked_out: ISODate("2012-10-15")
            },
            { _id: "987654321",
              title: "MongoDB: The Scaling Adventure", ...
            }, ...
          ]
        }
    • 35. Referencing vs. Embedding
        • Embedding is a bit like pre-joining data
        • Document level operations are easy for the server to handle
        • Embed when the “many” objects always appear with (viewed in the context of) their parents.
        • Reference when you need more flexibility
        How does your application access and manipulate data?
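The de-normalized checkout can be read as "pre-joining" at write time: the fields needed for display are copied into the patron when the book is checked out. A minimal sketch follows, with a plain object standing in for the stored document and `checkOut` as a hypothetical helper.

```javascript
// Copies the book fields the patron view needs into a checkout entry.
// Shell form of the write: db.patrons.updateOne({ _id: "joe" }, { $push: { checked_out: entry } })
function checkOut(patron, book, date) {
  const entry = {
    _id: book._id,
    title: book.title,
    authors: [...book.authors], // copy, so the entry does not share the book's array
    checked_out: date,
  };
  // Returns a new patron object; the original is left unchanged.
  return { ...patron, checked_out: [...(patron.checked_out || []), entry] };
}

const joe = { _id: "joe", name: "Joe Bookreader" };
const guide = {
  _id: "123456789",
  title: "MongoDB: The Definitive Guide",
  authors: ["Kristina Chodorow", "Mike Dirolf"],
};

const joeAfterCheckout = checkOut(joe, guide, new Date("2012-10-15"));

// Listing the checked-out titles now needs only the patron document.
const checkedOutTitles = joeAfterCheckout.checked_out.map((c) => c.title);
console.log(checkedOutTitles);
```

The trade-off is that the copied title and authors are a snapshot: if the book document changes later, the application has to decide whether to update the copies.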
    • 36. Many to Many Relations: example
    • 37. Books and Authors
        book = {
          title: "MongoDB: The Definitive Guide",
          published_date: ISODate("2010-09-24"),
          pages: 216,
          language: "English"
        }
        author = {
          _id: "kchodorow",
          name: "Kristina Chodorow",
          hometown: "New York"
        }
        author = {
          _id: "mdirolf",
          name: "Mike Dirolf",
          hometown: "Albany"
        }
    • 38. Relation stored in Book document
        book = {
          title: "MongoDB: The Definitive Guide",
          authors: [
            { _id: "kchodorow", name: "Kristina Chodorow" },
            { _id: "mdirolf", name: "Mike Dirolf" }
          ],
          published_date: ISODate("2010-09-24"),
          pages: 216,
          language: "English"
        }
        author = {
          _id: "kchodorow",
          name: "Kristina Chodorow",
          hometown: "New York"
        }
        author = {
          _id: "mdirolf",
          name: "Mike Dirolf",
          hometown: "Albany"
        }
    • 39. Relation stored in Author document
        book = {
          _id: 123456789,
          title: "MongoDB: The Definitive Guide",
          published_date: ISODate("2010-09-24"),
          pages: 216,
          language: "English"
        }
        author = {
          _id: "kchodorow",
          name: "Kristina Chodorow",
          hometown: "Cincinnati",
          books: [ { book_id: 123456789, title: "MongoDB: The Definitive Guide" } ]
        }
    • 40. Relation stored in both documents
        book = {
          _id: 123456789,
          title: "MongoDB: The Definitive Guide",
          authors: [ "kchodorow", "mdirolf" ],
          published_date: ISODate("2010-09-24"),
          pages: 216,
          language: "English"
        }
        author = {
          _id: "kchodorow",
          name: "Kristina Chodorow",
          hometown: "New York",
          books: [ 123456789, ... ]
        }
        author = {
          _id: "mdirolf",
          name: "Mike Dirolf",
          hometown: "Albany",
          books: [ 123456789, ... ]
        }
    • 41. Where do you put the reference? Think about common queries
        book = {
          title: "MongoDB: The Definitive Guide",
          authors: [
            { _id: "kchodorow", name: "Kristina Chodorow" },
            { _id: "mdirolf", name: "Mike Dirolf" }
          ],
          published_date: ISODate("2010-09-24"),
          pages: 216,
          language: "English"
        }
        author = {
          _id: "kchodorow",
          name: "Kristina Chodorow",
          hometown: "New York"
        }
        // dotted field paths must be quoted
        db.books.find( { "authors.name": "Kristina Chodorow" } )
    • 42. Where do you put the reference? Think about indexes
        book = {
          title: "MongoDB: The Definitive Guide",
          authors: [
            { _id: "kchodorow", name: "Kristina Chodorow" },
            { _id: "mdirolf", name: "Mike Dirolf" }
          ],
          published_date: ISODate("2010-09-24"),
          pages: 216,
          language: "English"
        }
        author = {
          _id: "kchodorow",
          name: "Kristina Chodorow",
          hometown: "New York"
        }
        // dotted field paths must be quoted
        db.books.createIndex( { "authors.name": 1 } )
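What the query on "authors.name" matches can be sketched in memory: a book qualifies when any element of its authors array has that name. A plain array stands in for the books collection, `booksByAuthorName` is a hypothetical helper, and the second book is invented purely so the filter has something to exclude.

```javascript
const bookShelf = [
  {
    title: "MongoDB: The Definitive Guide",
    authors: [
      { _id: "kchodorow", name: "Kristina Chodorow" },
      { _id: "mdirolf", name: "Mike Dirolf" },
    ],
  },
  {
    // Hypothetical second book, not from the deck.
    title: "Some Other Book",
    authors: [{ _id: "mdirolf", name: "Mike Dirolf" }],
  },
];

// In-memory equivalent of: db.books.find({ "authors.name": name })
// A book matches if at least one embedded author document has the given name.
function booksByAuthorName(books, name) {
  return books.filter((b) => (b.authors || []).some((a) => a.name === name));
}

console.log(booksByAuthorName(bookShelf, "Kristina Chodorow").map((b) => b.title));
console.log(booksByAuthorName(bookShelf, "Mike Dirolf").map((b) => b.title));
```

On the server, an index on "authors.name" is a multikey index because authors is an array, so each author name in a book gets its own index entry and the query does not need to scan every book.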
    • 43. Summary
        • Schema design is different in MongoDB
        • Basic data design principles apply
        • Focus on how application accesses and manipulates data
        • Evolve schema to meet changing requirements
        • Application-level logic is important!
    • 44. Thank You. Emily Stolfo (@EmStolfo), Ruby Engineer/Evangelist, 10gen. #mongodbdays
