Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Schema Design By ExampleEmily Stolfo - twitter: @EmStolfo
Differences with Traditional Schema DesignTraditional                MongoDBYour application doesnt   Its always about you...
Schema Design is Evolutionary   Design and Development.   Deployment and Monitoring.   Iterative Modification.
3 Components to Schema Design 1. The Data Your Application Needs. 2. How Your Application Will Read the Data. 3. How Your ...
What is a Document Database?   Storage in json.   Application-defined schema   Opportunity for choice
Schema - Author  author = {    "_id": int,    "first_name": string,    "last_name": string  }
Schema - User  user = {    "_id": int,    "username": string,    "password": string  }
Schema - Book  book = {    "_id": int,    "title": string,    "author": int,    "isbn": string,    "slug": string,    "pub...
Example: Library ApplicationAn application for saving a collection of books.
Some Sample Documents  > db.authors.findOne()  {    _id: 1,    first_name: "F. Scott",    last_name: "Fitgerald"  }  > db....
> db.books.findOne(){  _id: 1,  title: "The Great Gatsby",  slug: "9781857150193-the-great-gatsby",  author: 1,  available...
Design and Development.
The Data Our Library Needs   Embedded Documents      book.publisher:      {        publisher_name: "Everyman’s Library",  ...
How Our Library Reads Data.Query for all the books by a specific author     > author = db.authors.findOne({first_name: "F....
Query for books by title     > db.books.find({title: "The Great Gatsby"})     {       ...     }
Query for books in which I have made notes     > db.books.find({notes.user: 1})     {       ...     }     {       ...     }
How Our Library Writes Data.Add notes to a book     > note = { user: 1, note: "I did NOT like this book." }     > db.books...
Take Advantage of Atomic Operations     $set - set a value     $unset - unsets a value     $inc - increment an integer    ...
Deployment and Monitoring
How Our Library Reads DataQuery for books by an authors first name     > authors = db.authors.find( { first_name: /^f.*/i ...
Documents Should Reflect Query Patterns     Partially Embedded Documents          book = {            "_id": int,         ...
How Our Library Reads DataQuery for a books notes and get user info     > db.books.find({title: "The Great Gatsby"}, {_not...
Take Advantage of Immutable Data     Username is the natural key and is immutable         user = {           "_id": string...
How Our Library Writes DataAdd notes to a book     > note = { user: "craig.wilson@10gen.com", note: "I did NOT li     ke t...
Anti-pattern: Continually Growing Documents     Document Size Limit     Storage Fragmentation
Linking: One-to-One relationship     Move notes to another document, with one document per     book id.          book = { ...
Linking: One-to-Many relationship     Move notes to another document with one document per note.          book = {        ...
Bucketing     bookNotes contains a maximum number of documents            bookNotes = {              "_id": int,          ...
Interative Modification
The Data Our Library NeedsUsers want to comment on other peoples notes    bookNotes = {      "_id": int,      "book": int,...
What benefits does this provide?     Single document for all comments on a note.     Single location on disk for the whole...
Alternative solution:Store arrays of ancestors     > t = db.mytree;     >   t.find()     {   _id: "a" }     {   _id: "b", ...
General notes for read-heavy schemasThink about indexes and sharding     Multi-key indexes.     Secondary indexes.     Sha...
Schema design in mongoDBIts about your application.Its about your data and how its used.Its about the entire lifetime of y...
Schema Design By Example
Schema Design By Example
Upcoming SlideShare
Loading in …5
×

Schema Design By Example

80,109 views

Published on

One of the challenges that comes with moving to MongoDB is figuring how to best model your data. While most developers have internalized the rules of thumb for designing schemas for RDBMSs, these rules don't always apply to MongoDB. The simple fact that documents can represent rich, schema-free data structures means that we have a lot of viable alternatives to the standard, normalized, relational model. Not only that, MongoDB has several unique features, such as atomic updates and indexed array keys, that greatly influence the kinds of schemas that make sense. Understandably, this begets good questions...

  • Nice !! Download 100 % Free Ebooks, PPts, Study Notes, Novels, etc @ https://www.ThesisScientist.com
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
  • For data visualization,data analytics,data intelligence and ERP Tools, online training with job placements, register at http://www.todaycourses.com
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here

Schema Design By Example

  1. 1. Schema Design By ExampleEmily Stolfo - twitter: @EmStolfo
  2. 2. Differences with Traditional Schema DesignTraditional MongoDBYour application doesnt Its always about your application.matter.Only About the Data Not Only About the Data, but also how its used.Only relevant during Relevant during the lifetime of yourdesign. application.
  3. 3. Schema Design is Evolutionary Design and Development. Deployment and Monitoring. Iterative Modification.
  4. 4. 3 Components to Schema Design 1. The Data Your Application Needs. 2. How Your Application Will Read the Data. 3. How Your Application Will Write the Data.
  5. 5. What is a Document Database? Storage in json. Application-defined schema Opportunity for choice
  6. 6. Schema - Author author = { "_id": int, "first_name": string, "last_name": string }
  7. 7. Schema - User user = { "_id": int, "username": string, "password": string }
  8. 8. Schema - Book book = { "_id": int, "title": string, "author": int, "isbn": string, "slug": string, "publisher": { "name": string, "date": timestamp, "city": string }, "available": boolean, "pages": int, "summary": string, "subjects": [string, string] "notes": [{ "user": int, "note": string }], "language": string }
  9. 9. Example: Library ApplicationAn application for saving a collection of books.
  10. 10. Some Sample Documents > db.authors.findOne() { _id: 1, first_name: "F. Scott", last_name: "Fitgerald" } > db.users.findOne() { _id: 1, username: "craig.wilson@10gen.com", password: "slsjfk4odk84k209dlkdj90009283d" }
  11. 11. > db.books.findOne(){ _id: 1, title: "The Great Gatsby", slug: "9781857150193-the-great-gatsby", author: 1, available: true, isbn: "9781857150193", pages: 176, publisher: { publisher_name: "Everyman’s Library", date: ISODate("1991-09-19T00:00:00Z"), publisher_city: "London" }, summary: "Set in the post-Great War...", subjects: ["Love stories", "1920s", "Jazz Age"], notes: [ { user: 1, note: "One of the best..."}, { user: 2, note: "It’s hard to..."} ], language: "English"}
  12. 12. Design and Development.
  13. 13. The Data Our Library Needs Embedded Documents book.publisher: { publisher_name: "Everyman’s Library", date: ISODate("1991-09-19T00:00:00Z"), city: "London" } Embedded Arrays book.subjects: ["Love stories", "1920s", "Jazz Age"], Embedded Arrays of Documents book.notes: [ { user: 1, note: "One of the best..."}, { user: 2, note: "It’s hard to..."} ],
  14. 14. How Our Library Reads Data.Query for all the books by a specific author > author = db.authors.findOne({first_name: "F. Scott", last_na me: "Fitzgerald"}); > db.books.find({author: author._id}) { ... } { ... }
  15. 15. Query for books by title > db.books.find({title: "The Great Gatsby"}) { ... }
  16. 16. Query for books in which I have made notes > db.books.find({notes.user: 1}) { ... } { ... }
  17. 17. How Our Library Writes Data.Add notes to a book > note = { user: 1, note: "I did NOT like this book." } > db.books.update({ _id: 1 }, { $push: { notes: note }})
  18. 18. Take Advantage of Atomic Operations $set - set a value $unset - unsets a value $inc - increment an integer $push - append a value to an array $pushAll - append several values to an array $pull - remove a value from an array $pullAll - remove several values from an array $bit - bitwise operations $addToSet - adds a value to a set if it doesnt already exist $rename - renames a field
  19. 19. Deployment and Monitoring
  20. 20. How Our Library Reads DataQuery for books by an authors first name > authors = db.authors.find( { first_name: /^f.*/i }, {_id: 1} ) > authorIds = authors.map(function(x) { return x._id; }) > db.books.find({author: { $in: authorIds } }) { ... } { ... }
  21. 21. Documents Should Reflect Query Patterns Partially Embedded Documents book = { "_id": int, "title": string, "author": { "author": int, "name": string }, ... } Query for books by an authors first name using an embedded document with a denormalized name. > db.books.find({author.name: /^f.*/i }) { ... } { ... }
  22. 22. How Our Library Reads DataQuery for a books notes and get user info > db.books.find({title: "The Great Gatsby"}, {_notes: 1}) { _id: 1, notes: [ { user: 1, note: "One of the best..."}, { user: 2, note: "It’s hard to..."} ] }
  23. 23. Take Advantage of Immutable Data Username is the natural key and is immutable user = { "_id": string, "password": string } book = { //... "notes": [{ "user": string, "note": string }], } Query for a books notes > db.books.find({title: "The Great Gatsby"}, {notes: 1}) { _id: 1, notes: [ { user: "craig.wilson@10gen.com", note: "One of the b est..."}, { user: "jmcjack@mcjack.com", note: "It’s hard to..." } ] }
  24. 24. How Our Library Writes DataAdd notes to a book > note = { user: "craig.wilson@10gen.com", note: "I did NOT li ke this book." } > db.books.update({ _id: 1 }, { $push: { notes: note }})
  25. 25. Anti-pattern: Continually Growing Documents Document Size Limit Storage Fragmentation
  26. 26. Linking: One-to-One relationship Move notes to another document, with one document per book id. book = { "_id": int, ... // remove the notes field } bookNotes = { "_id": int, // this will be the same as the book id... "notes": [{ "user": string, "note": string }] } Keeps the books document size consistent. Queries for books dont return all the notes. Still suffers from a continually growing bookNotes document.
  27. 27. Linking: One-to-Many relationship Move notes to another document with one document per note. book = { "_id": int, ... // remove the notes field } bookNotes = { "_id": int, "book": int, // this will be the same as the book id... "date": timestamp, "user": string, "note": string } Keeps the book document size consistent. Queries for books dont return all the notes. Notes not necessarily near each other on disk. Possibly slow reads.
  28. 28. Bucketing bookNotes contains a maximum number of documents bookNotes = { "_id": int, "book": int, // this is the book id "note_count": int, "last_changed": datetime, "notes": [{ "user": string, "note": string }] } Still use atomic operations to update or create a document > note = { user: "craig.wilson@10gen.com", note: "I did N OT like this book." }> db.bookNotes.update({ { book: 1, note_count { $lt: 10 } }, { $inc: { note_count: 1 }, $push { notes: note }, $set: { last_changed: new Date() } }, true // upsert })
  29. 29. Interative Modification
  30. 30. The Data Our Library NeedsUsers want to comment on other peoples notes bookNotes = { "_id": int, "book": int, // this is the book id "note_count": int, "last_changed": datetime, "notes": [{ "user": string, "note": string, "comments": [{ "user": string, "text": string, "replies": [{ "user": string, "text": string, "replies": [{...}] }] }] }] }
  31. 31. What benefits does this provide? Single document for all comments on a note. Single location on disk for the whole tree. Legible tree structure.What are the drawbacks of storing a tree in this way? Difficult to search. Difficult to get back partial results. Document can get large very quickly.
  32. 32. Alternative solution:Store arrays of ancestors > t = db.mytree; > t.find() { _id: "a" } { _id: "b", ancestors: [ "a" ], parent: "a" } { _id: "c", ancestors: [ "a", "b" ], parent: "b" } { _id: "d", ancestors: [ "a", "b" ], parent: "b" } { _id: "e", ancestors: [ "a" ], parent: "a" } { _id: "f", ancestors: [ "a", "e" ], parent: "e" } { _id: "g", ancestors: [ "a", "b", "d" ], parent: "d" }How would you do it?
  33. 33. General notes for read-heavy schemasThink about indexes and sharding Multi-key indexes. Secondary indexes. Shard key choice.
  34. 34. Schema design in mongoDBIts about your application.Its about your data and how its used.Its about the entire lifetime of your application.

×