Consulting Engineer, 10genJason Zucchetto#MongoSFSchema Design
Single Table EnAgenda• Working with documents• Evolving a Schema• Queries and Indexes• Common Patterns
TerminologyRDBMS MongoDBDatabase ➜ DatabaseTable ➜ CollectionRow ➜ DocumentIndex ➜ IndexJoin ➜ Embedded DocumentForeign Ke...
Working withDocuments
Modeling Data
DocumentsProvide flexibility andperformance
Normalized Data
De-Normalized (embedded)Data
Relational Schema DesignFocus on data storage
Document Schema DesignFocus on data use
Data Access• Flexible Schemas• Ability to embed complex data structures• Secondary Indexes• Multi-Key Indexes• Aggregation...
Getting Started
Library ManagementApplication• Patrons• Books• Authors• Publishers
An ExampleOne to One Relations
patron = {_id: "joe",name: "Joe Bookreader”}address = {patron_id = "joe",street: "123 Fake St. ",city: "Faketon",state: "M...
One to One Relations• Mostly the same as the relational approach• Generally good idea to embed “contains”relationships• Do...
An ExampleOne To Many Relations
patron = {_id: "joe",name: "Joe Bookreader",join_date: ISODate("2011-10-15"),addresses: [{street: "1 Vernon St.", city: "N...
Publishers and Books• Publishers put out many books• Books have one publisher
MongoDB: The Definitive Guide,By Kristina Chodorow and Mike DirolfPublished: 9/24/2010Pages: 216Language: EnglishPublisher...
book = {title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ],published_date: ISODate("20...
publisher = {name: "O’Reilly Media",founded: "1980",location: "CA"}book = {title: "MongoDB: The Definitive Guide",authors:...
publisher = {_id: "oreilly",name: "O’Reilly Media",founded: "1980",location: "CA"}book = {title: "MongoDB: The Definitive ...
publisher = {name: "O’Reilly Media",founded: "1980",location: "CA"books: [ "123456789", ... ]}book = {_id: "123456789",tit...
Where Do You Put the ForeignKey?• Array of books inside of publisher– Makes sense when many means a handful of items– Usef...
Another ExampleOne to Many Relations
Books and Patrons• Book can be checked out by one Patron at a time• Patrons can check out many books (but not1000’s)
patron = {_id: "joe",name: "Joe Bookreader",join_date: ISODate("2011-10-15"),address: { ... }}book = {_id: "123456789",tit...
patron = {_id: "joe",name: "Joe Bookreader",join_date: ISODate("2011-10-15"),address: { ... },checked_out: [{ _id: "123456...
De-normalize for speedDenormalizationProvides data locality
patron = {_id: "joe",name: "Joe Bookreader",join_date: ISODate("2011-10-15"),address: { ... },checked_out: [{ _id: "123456...
Referencing vs. Embedding• Embedding is a bit like pre-joined data• Document-level ops are easy for server tohandle• Embed...
An ExampleSingle Table Inheritance
book = {title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ],published_date: ISODate("20...
An ExampleMany to Many Relations
book = {title: "MongoDB: The Definitive Guide",authors = [{ _id: "kchodorow", name: "K-Awesome" },{ _id: "mdirolf", name: ...
An ExampleTrees
book = {title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ],published_date: ISODate("20...
book = {_id: 123456789,title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ],published_da...
Modeling Trees• Parent Links- Each node is stored as a document- Contains the id of the parent• Child Links- Each node con...
book = {title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ],published_date: ISODate("20...
An ExampleQueues
book = {_id: 123456789,title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ],published_da...
Consulting Engineer, 10genJason Zucchetto#MongoSFThank You
Upcoming SlideShare
Loading in …5
×

MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

2,751 views

Published on

MongoDB’s basic unit of storage is a document. Documents can represent rich, schema-free data structures, meaning that we have several viable alternatives to the normalized, relational model. In this talk, we’ll discuss the tradeoff of various data modeling strategies in MongoDB using a library as a sample application. You will learn how to work with documents, evolve your schema, and common schema design patterns.

Published in: Technology, Business
0 Comments
9 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
2,751
On SlideShare
0
From Embeds
0
Number of Embeds
1,654
Actions
Shares
0
Downloads
52
Comments
0
Likes
9
Embeds 0
No embeds

No notes for slide
  • In the filing cabinet model, the patient’s x-rays, checkups, and allergies are stored in separate drawers and pulled together (like an RDBMS)In the file folder model, we store all of the patient information in a single folder (like MongoDB)
  • Flexibility – Ability to represent rich data structuresPerformance – Benefit from data locality
  • Concrete example of typical blog in typical relational normalized form
  • Concrete example of typical blog using a document oriented de-normalized approach
  • Tools for data access
  • Slow to get address data every time you query for a user. Requires an extra operation.
  • Publisher is repeated for every book, data duplication!
  • Publisher is better being a separate entity and having its own collection.
  • Now to create a relation between the two entities, you can choose to reference the publisher from the book document.This is similar to the relational approach for this very same problem.
  • OR: because we are using MongoDB and documents can have arrays you can choose to model the relation by creating and maintaining an array of books within each publisher entity.Careful with mutable, growing arrays. See next slide.
  • Costly for a small number of books because to get the publisher
  • And data locality provides speed
  • Book’s kind attribute could be local or loanableNote that we have locations for loanable books but not for localNote that these two separate schemas can co-exist (loanable books / local books are both books)
  • Authors often use pseudonyms for a book even though it’s the same individualTo get books by a particular author: - get the author - get books that have that author id in array
  • MongoDB San Francisco 2013: Schema design presented by Jason Zucchetto, Consulting Engineer, 10ge

    1. 1. Consulting Engineer, 10genJason Zucchetto#MongoSFSchema Design
    2. 2. Single Table EnAgenda• Working with documents• Evolving a Schema• Queries and Indexes• Common Patterns
    3. 3. TerminologyRDBMS MongoDBDatabase ➜ DatabaseTable ➜ CollectionRow ➜ DocumentIndex ➜ IndexJoin ➜ Embedded DocumentForeign Key ➜ Reference
    4. 4. Working withDocuments
    5. 5. Modeling Data
    6. 6. DocumentsProvide flexibility andperformance
    7. 7. Normalized Data
    8. 8. De-Normalized (embedded)Data
    9. 9. Relational Schema DesignFocus on data storage
    10. 10. Document Schema DesignFocus on data use
    11. 11. Data Access• Flexible Schemas• Ability to embed complex data structures• Secondary Indexes• Multi-Key Indexes• Aggregation Framework– $project, $match, $limit, $skip, $sort, $group, $unwind• No Joins
    12. 12. Getting Started
    13. 13. Library ManagementApplication• Patrons• Books• Authors• Publishers
    14. 14. An ExampleOne to One Relations
    15. 15. patron = {_id: "joe",name: "Joe Bookreader”}address = {patron_id = "joe",street: "123 Fake St. ",city: "Faketon",state: "MA",zip: 12345}Modeling Patronspatron = {_id: "joe",name: "Joe Bookreader",address: {street: "123 Fake St. ",city: "Faketon",state: "MA",zip: 12345}}
    16. 16. One to One Relations• Mostly the same as the relational approach• Generally good idea to embed “contains”relationships• Document model provides a holisticrepresentation of objects
    17. 17. An ExampleOne To Many Relations
    18. 18. patron = {_id: "joe",name: "Joe Bookreader",join_date: ISODate("2011-10-15"),addresses: [{street: "1 Vernon St.", city: "Newton", state: "MA", …},{street: "52 Main St.", city: "Boston", state: "MA", …}]}Modeling Patrons
    19. 19. Publishers and Books• Publishers put out many books• Books have one publisher
    20. 20. MongoDB: The Definitive Guide,By Kristina Chodorow and Mike DirolfPublished: 9/24/2010Pages: 216Language: EnglishPublisher: O’Reilly Media, CABook
    21. 21. book = {title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ],published_date: ISODate("2010-09-24"),pages: 216,language: "English",publisher: {name: "O’Reilly Media",founded: "1980",location: "CA"}}Modeling Books – EmbeddedPublisher
    22. 22. publisher = {name: "O’Reilly Media",founded: "1980",location: "CA"}book = {title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ],published_date: ISODate("2010-09-24"),pages: 216,language: "English"}Modeling Books & PublisherRelationship
    23. 23. publisher = {_id: "oreilly",name: "O’Reilly Media",founded: "1980",location: "CA"}book = {title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ],published_date: ISODate("2010-09-24"),pages: 216,language: "English",publisher_id: "oreilly"}Publisher _id as a ForeignKey
    24. 24. publisher = {name: "O’Reilly Media",founded: "1980",location: "CA"books: [ "123456789", ... ]}book = {_id: "123456789",title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ],published_date: ISODate("2010-09-24"),pages: 216,language: "English"}Book _id as a Foreign Key
    25. 25. Where Do You Put the ForeignKey?• Array of books inside of publisher– Makes sense when many means a handful of items– Useful when items have bound on potential growth• Reference to single publisher on books– Useful when items have unbounded growth (unlimited # ofbooks)• SQL doesn’t give you a choice, no arrays
    26. 26. Another ExampleOne to Many Relations
    27. 27. Books and Patrons• Book can be checked out by one Patron at a time• Patrons can check out many books (but not1000’s)
    28. 28. patron = {_id: "joe",name: "Joe Bookreader",join_date: ISODate("2011-10-15"),address: { ... }}book = {_id: "123456789",title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ],...}Modeling Checkouts
    29. 29. patron = {_id: "joe",name: "Joe Bookreader",join_date: ISODate("2011-10-15"),address: { ... },checked_out: [{ _id: "123456789", checked_out: "2012-10-15" },{ _id: "987654321", checked_out: "2012-09-12" },...]}Modeling Checkouts
    30. 30. De-normalize for speedDenormalizationProvides data locality
    31. 31. patron = {_id: "joe",name: "Joe Bookreader",join_date: ISODate("2011-10-15"),address: { ... },checked_out: [{ _id: "123456789",title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ],checked_out: ISODate("2012-10-15")},{ _id: "987654321"title: "MongoDB: The Scaling Adventure",...}, ...]}Modeling Checkouts:Denormalized
    32. 32. Referencing vs. Embedding• Embedding is a bit like pre-joined data• Document-level ops are easy for server tohandle• Embed when the many objects always appearwith (i.e. viewed in the context of) their parent• Reference when you need more flexibility
    33. 33. An ExampleSingle Table Inheritance
    34. 34. book = {title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ],published_date: ISODate("2010-09-24"),kind: "loanable",locations: [ ... ],pages: 216,language: "English",publisher: {name: "O’Reilly Media",founded: "1980",location: "CA"}}Single Table Inheritance
    35. 35. An ExampleMany to Many Relations
    36. 36. book = {title: "MongoDB: The Definitive Guide",authors = [{ _id: "kchodorow", name: "K-Awesome" },{ _id: "mdirolf", name: "Batman Mike" },]published_date: ISODate("2010-09-24"),pages: 216,language: "English"}author = {_id: "kchodorow",name: "Kristina Chodorow",hometown: "New York"}Books and Authors
    37. 37. An ExampleTrees
    38. 38. book = {title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ],published_date: ISODate("2010-09-24"),pages: 216,language: "English",category: "MongoDB"}category = { _id: MongoDB, parent: "Databases" }category = { _id: Databases, parent: "Programming" }Parent Links
    39. 39. book = {_id: 123456789,title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ],published_date: ISODate("2010-09-24"),pages: 216,language: "English"}category = { _id: MongoDB, children: [ 123456789, … ] }category = { _id: Databases, children: ["MongoDB", "Postgres"}category = { _id: Programming, children: ["DB", "Languages"] }Child Links
    40. 40. Modeling Trees• Parent Links- Each node is stored as a document- Contains the id of the parent• Child Links- Each node contains the id’s of the children- Can support graphs (multiple parents / child)
    41. 41. book = {title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ],published_date: ISODate("2010-09-24"),pages: 216,language: "English",categories: ["Programming", "Databases", "MongoDB” ]}book = {title: "MySQL: The Definitive Guide",authors: [ "Michael Kofler" ],published_date: ISODate("2010-09-24"),pages: 216,language: "English",parent: "MongoDB",ancestors: [ "Programming", "Databases", "MongoDB"]}Array of Ancestors
    42. 42. An ExampleQueues
    43. 43. book = {_id: 123456789,title: "MongoDB: The Definitive Guide",authors: [ "Kristina Chodorow", "Mike Dirolf" ],published_date: ISODate("2010-09-24"),pages: 216,language: "English",available: 3}db.books.findAndModify({query: { _id: 123456789, available: { "$gt": 0 } },update: { $inc: { available: -1 } }})Book Document
    44. 44. Consulting Engineer, 10genJason Zucchetto#MongoSFThank You

    ×