#MongoBostonSchema DesignChad TindelSenior Solution Architect, 10gen
Agenda• Working with documents• Evolving a Schema• Queries and Indexes• Common Patterns                      Single Table En
RDBMS                MongoDBDatabase      ➜ DatabaseTable         ➜ CollectionRow           ➜ DocumentIndex         ➜ Inde...
Working withDocuments
Modeling Data
DocumentsProvide flexibility andperformance
Normalized Data
Disk seeks and data locality    Seek = 5+ ms          Read = really really fast                   Post                    ...
De-Normalized (embedded)Data
Disk seeks and data locality           Article             Author              Comment               Tag             Comme...
Relational Schema DesignFocus on data storage
Document Schema DesignFocus on data use
Schema DesignConsiderations• How do we manipulate the data?   – Dynamic Ad-Hoc Queries   – Atomic Updates   – Map Reduce• ...
Data Manipulation• Conditional Query Operators  – Scalar: $ne, $mod, $exists, $type, $lt, $lte, $gt, $gte, $ne  – Vector: ...
Data Access• Flexible Schemas• Ability to embed complex data structures• Secondary Indexes• Compound Indexes• Multi-Value ...
Getting Started
Library ManagementApplication• Patrons• Books• Authors• Publishers
An ExampleOne to One Relations
Modeling Patronspatron = {                   patron = {  _id: "joe"                   _id: "joe"  name: "Joe Bookreader”  ...
One to One Relations• Mostly the same as the relational approach• Generally good idea to embed “contains” relationships• D...
An ExampleOne To Many Relations
Modeling Patronspatron = {  _id: "joe"  name: "Joe Bookreader",  join_date: ISODate("2011-10-15"),  addresses: [    {stree...
Publishers and Books• Publishers put out many books• Books have one publisher
BookMongoDB: The Definitive Guide,By Kristina Chodorow and Mike DirolfPublished: 9/24/2010Pages: 216Language: EnglishPubli...
Modeling Books – EmbeddedPublisherbook = {  title: "MongoDB: The Definitive Guide",  authors: [ "Kristina Chodorow", "Mike...
Modeling Books & PublisherRelationshippublisher = {  name: "O’Reilly Media",  founded: "1980",  location: "CA"}book = {  t...
Publisher _id as a ForeignKeypublisher = {  _id: "oreilly",  name: "O’Reilly Media",  founded: "1980",  location: "CA"}boo...
Book _id as a Foreign Keypublisher = {  name: "O’Reilly Media",  founded: "1980",  location: "CA"  books: [ "123456789", ....
Where do you put the foreignKey?• Array of books inside of publisher   – Makes sense when many means a handful of items   ...
Another ExampleOne to Many Relations
Books and Patrons• Book can be checked out by one Patron at a time• Patrons can check out many books (but not 1000s)
Modeling Checkoutspatron = {  _id: "joe"  name: "Joe Bookreader",  join_date: ISODate("2011-10-15"),  address: { ... }}boo...
Modeling Checkoutspatron = {    _id: "joe"    name: "Joe Bookreader",    join_date: ISODate("2011-10-15"),    address: { ....
De-normalizationProvides data locality           De-normalize for speed
Modeling Checkouts -- de-normalizedpatron = {  _id: "joe"  name: "Joe Bookreader",  join_date: ISODate("2011-10-15"),  add...
Referencing vs. Embedding• Embedding is a bit like pre-joined data• Document level ops are easy for server to handle• Embe...
An ExampleSingle Table Inheritance
Single Table Inheritancebook = {  title: "MongoDB: The Definitive Guide",  authors: [ "Kristina Chodorow", "Mike Dirolf" ]...
An ExampleMany to Many Relations
Books and Authorsbook = {  title: "MongoDB: The Definitive Guide",  authors = [      { _id: "kchodorow", name: ”Kristina C...
An ExampleTrees
Modeling Trees• Parent Links  - Each node is stored as a document  - Contains the id of the parent• Child Links  - Each no...
Parent Linksbook = {  title: "MongoDB: The Definitive Guide",  authors: [ "Kristina Chodorow", "Mike Dirolf" ],  published...
Child Linksbook = {  _id: 123456789,  title: "MongoDB: The Definitive Guide",  authors: [ "Kristina Chodorow", "Mike Dirol...
Array of Ancestorsbook = {  title: "MongoDB: The Definitive Guide",  authors: [ "Kristina Chodorow", "Mike Dirolf" ],  pub...
An ExampleQueues
Book Documentbook = {  _id: 123456789,  title: "MongoDB: The Definitive Guide",  authors: [ "Kristina Chodorow", "Mike Dir...
#ConferenceHashTagThank YouSpeaker NameJob Title, 10gen
Upcoming SlideShare
Loading in …5
×

Schema design mongo_boston

504 views

Published on

0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
504
On SlideShare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
16
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide
  • In the filing cabinet model, the patient’s x-rays, checkups, and allergies are stored in separate drawers and pulled together (like an RDBMS)In the file folder model, we store all of the patient information in a single folder (like MongoDB)
  • Flexibility – Ability to represent rich data structuresPerformance – Benefit from data locality
  • Concrete example of typical blog in typical relational normalized form
  • Concrete example of typical blog in typical relational normalized form
  • Concrete example of typical blog using a document oriented de-normalized approach
  • Concrete example of typical blog in typical relational normalized form
  • Tools for data manipulation
  • Tools for data access
  • Slow to get address data every time you query for a user. Requires an extra operation.
  • Patron may have two addresses, in this case, you would need a separate table in a relation databaseWith MongoDB, you simply start storing the address field as an arrayOnly patrons which have multiple addresses could have this schema!No migration necessary! but Caution: Additional application logic required!
  • Publisher is repeated for every book, data duplication!
  • Publisher is better being a separate entity and having its own collection.
  • Now to create a relation between the two entities, you can choose to reference the publisher from the book document.This is similar to the relational approach for this very same problem.
  • OR: because we are using MongoDB and documents can have arrays you can choose to model the relation by creating and maintaining an array of books within each publisher entity.Careful with mutable, growing arrays. See next slide.
  • Costly for a small number of books because to get the publisher
  • And data locality provides speed
  • Book’s kind attribute could be local or loanableNote that we have locations for loanable books but not for localNote that these two separate schemas can co-exist (loanable books / local books are both books)
  • Note that we partially de-normalize here.To get books by a particular author: - get the author - get books that have that author id in array
  • Simple solution. The biggest problem with this approach is getting an entire subtree requires several query turnarounds to the database
  • It may also be good for storing graphs where a node has multiple parents.
  • Schema design mongo_boston

    1. 1. #MongoBostonSchema DesignChad TindelSenior Solution Architect, 10gen
    2. 2. Agenda• Working with documents• Evolving a Schema• Queries and Indexes• Common Patterns Single Table En
    3. 3. RDBMS MongoDBDatabase ➜ DatabaseTable ➜ CollectionRow ➜ DocumentIndex ➜ IndexJoin ➜ Embedded DocumentForeign Key ➜ ReferenceTerminology
    4. 4. Working withDocuments
    5. 5. Modeling Data
    6. 6. DocumentsProvide flexibility andperformance
    7. 7. Normalized Data
    8. 8. Disk seeks and data locality Seek = 5+ ms Read = really really fast Post Comment Author Tags
    9. 9. De-Normalized (embedded)Data
    10. 10. Disk seeks and data locality Article Author Comment Tag Comment Comment Comment Comment Comment
    11. 11. Relational Schema DesignFocus on data storage
    12. 12. Document Schema DesignFocus on data use
    13. 13. Schema DesignConsiderations• How do we manipulate the data? – Dynamic Ad-Hoc Queries – Atomic Updates – Map Reduce• What are the access patterns of the application? – Read/Write Ratio – Types of Queries / Updates – Data life-cycle and growth rate
    14. 14. Data Manipulation• Conditional Query Operators – Scalar: $ne, $mod, $exists, $type, $lt, $lte, $gt, $gte, $ne – Vector: $in, $nin, $all, $size• Atomic Update Operators – Scalar: $inc, $set, $unset – Vector: $push, $pop, $pull, $pushAll, $pullAll, $addToSet
    15. 15. Data Access• Flexible Schemas• Ability to embed complex data structures• Secondary Indexes• Compound Indexes• Multi-Value Indexes (Indexes on Arrays)• Aggregation Framework – $project, $match, $limit, $skip, $sort, $group, $unwind• No Joins
    16. 16. Getting Started
    17. 17. Library ManagementApplication• Patrons• Books• Authors• Publishers
    18. 18. An ExampleOne to One Relations
    19. 19. Modeling Patronspatron = { patron = { _id: "joe" _id: "joe" name: "Joe Bookreader” name: "Joe Bookreader",} address: { street: "123 Fake St. ",address = { city: "Faketon", patron_id = "joe", state: "MA", street: "123 Fake St. ", city: "Faketon", zip: 12345 state: "MA", } zip: 12345 }}
    20. 20. One to One Relations• Mostly the same as the relational approach• Generally good idea to embed “contains” relationships• Document model provides a holistic representation of objects
    21. 21. An ExampleOne To Many Relations
    22. 22. Modeling Patronspatron = { _id: "joe" name: "Joe Bookreader", join_date: ISODate("2011-10-15"), addresses: [ {street: "1 Vernon St.", city: "Newton", state: "MA", …}, {street: "52 Main St.", city: "Boston", state: "MA", …}, ]}
    23. 23. Publishers and Books• Publishers put out many books• Books have one publisher
    24. 24. BookMongoDB: The Definitive Guide,By Kristina Chodorow and Mike DirolfPublished: 9/24/2010Pages: 216Language: EnglishPublisher: O’Reilly Media, CA
    25. 25. Modeling Books – EmbeddedPublisherbook = { title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ] published_date: ISODate("2010-09-24"), pages: 216, language: "English", publisher: { name: "O’Reilly Media", founded: "1980", location: "CA" }}
    26. 26. Modeling Books & PublisherRelationshippublisher = { name: "O’Reilly Media", founded: "1980", location: "CA"}book = { title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ] published_date: ISODate("2010-09-24"), pages: 216, language: "English"}
    27. 27. Publisher _id as a ForeignKeypublisher = { _id: "oreilly", name: "O’Reilly Media", founded: "1980", location: "CA"}book = { title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ] published_date: ISODate("2010-09-24"), pages: 216, language: "English", publisher_id: "oreilly"}
    28. 28. Book _id as a Foreign Keypublisher = { name: "O’Reilly Media", founded: "1980", location: "CA" books: [ "123456789", ... ]}book = { _id: "123456789", title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ] published_date: ISODate("2010-09-24"), pages: 216, language: "English"}
    29. 29. Where do you put the foreignKey?• Array of books inside of publisher – Makes sense when many means a handful of items – Useful when items have bound on potential growth• Reference to single publisher on books – Useful when items have unbounded growth (unlimited # of books)• SQL doesn’t give you a choice, no arrays
    30. 30. Another ExampleOne to Many Relations
    31. 31. Books and Patrons• Book can be checked out by one Patron at a time• Patrons can check out many books (but not 1000s)
    32. 32. Modeling Checkoutspatron = { _id: "joe" name: "Joe Bookreader", join_date: ISODate("2011-10-15"), address: { ... }}book = { _id: "123456789" title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ], ...}
    33. 33. Modeling Checkoutspatron = { _id: "joe" name: "Joe Bookreader", join_date: ISODate("2011-10-15"), address: { ... }, checked_out: [ { _id: "123456789", checked_out: "2012-10-15" }, { _id: "987654321", checked_out: "2012-09-12" }, ... ]}
    34. 34. De-normalizationProvides data locality De-normalize for speed
    35. 35. Modeling Checkouts -- de-normalizedpatron = { _id: "joe" name: "Joe Bookreader", join_date: ISODate("2011-10-15"), address: { ... }, checked_out: [ { _id: "123456789", title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ], checked_out: ISODate("2012-10-15") }, { _id: "987654321" title: "MongoDB: The Scaling Adventure", ... }, ... ]}
    36. 36. Referencing vs. Embedding• Embedding is a bit like pre-joined data• Document level ops are easy for server to handle• Embed when the “many” objects always appear with (viewed in the context of) their parents.• Reference when you need more flexibility
    37. 37. An ExampleSingle Table Inheritance
    38. 38. Single Table Inheritancebook = { title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ] published_date: ISODate("2010-09-24"), kind: ”loanable" locations: [ ... ] pages: 216, language: "English", publisher: { name: "O’Reilly Media", founded: "1980", location: "CA" }}
    39. 39. An ExampleMany to Many Relations
    40. 40. Books and Authorsbook = { title: "MongoDB: The Definitive Guide", authors = [ { _id: "kchodorow", name: ”Kristina Chodorow” }, { _id: "mdirolf", name: ”Mike Dirolf” } ] published_date: ISODate("2010-09-24"), pages: 216, language: "English"}author = { _id: "kchodorow", name: "Kristina Chodorow", hometown: "New York"}db.authors.find ( { _id : ‘kchodorow’ } )db.books.find( { authors.id : ‘kchodorow’ } )
    41. 41. An ExampleTrees
    42. 42. Modeling Trees• Parent Links - Each node is stored as a document - Contains the id of the parent• Child Links - Each node contains the id’s of the children - Can support graphs (multiple parents / child)
    43. 43. Parent Linksbook = { title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ], published_date: ISODate("2010-09-24"), pages: 216, language: "English", category: "MongoDB"}category = { _id: MongoDB, parent: Databases }category = { _id: Databases, parent: Programming }category = { _id: Programming }
    44. 44. Child Linksbook = { _id: 123456789, title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ], published_date: ISODate("2010-09-24"), pages: 216, language: "English"}category = { _id: MongoDB, children: [ 123456789, … ] }category = { _id: Databases, children: [ MongoDB, Postgres }category = { _id: Programming, children: [ DB, Languages ] }
    45. 45. Array of Ancestorsbook = { title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ], published_date: ISODate("2010-09-24"), pages: 216, language: "English", parent: ”MongoDB”, categories: [ Programming, Databases, MongoDB ]}book = { title: "MySQL: The Definitive Guide", authors: [ ”Michael Kofler" ], published_date: ISODate("2010-09-24"), pages: 320, language: "English", parent: ”MySQL", categories: [ Programming, Databases, MySQL ]}db.books.find( { categories : MongoDB } )
    46. 46. An ExampleQueues
    47. 47. Book Documentbook = { _id: 123456789, title: "MongoDB: The Definitive Guide", authors: [ "Kristina Chodorow", "Mike Dirolf" ], published_date: ISODate("2010-09-24"), pages: 216, language: "English", available: 3}db.books.findAndModify({ query: { _id: 123456789, available: { "$gt": 0 } }, update: { $inc: { available: -1 } }})
    48. 48. #ConferenceHashTagThank YouSpeaker NameJob Title, 10gen

    ×