Schema Design
 

Usage Rights

© All Rights Reserved

Schema Design: Presentation Transcript

  • #mongodbdays: Schema Design. Emily Stolfo, Ruby Engineer/Evangelist, 10gen (@EmStolfo). Tuesday, January 29, 2013
  • Agenda
      • Working with documents
      • Common patterns
      • Evolving a Schema
      • Queries and Indexes
  • Terminology
      RDBMS        ➜ MongoDB
      Database     ➜ Database
      Table        ➜ Collection
      Row          ➜ Document
      Index        ➜ Index
      Join         ➜ Embedded Document
      Foreign Key  ➜ Reference
  • Working with Documents
  • Documents provide flexibility and performance
  • Example Schema (MongoDB)
  • Example Schema (MongoDB): Embedding
  • Example Schema (MongoDB): Embedding, Linking
  • Relational schema design focuses on data storage
  • Document schema design focuses on data use
  • Schema Design Considerations
      • What is a priority?
          – High consistency
          – High read performance
          – High write performance
      • How does the application access and manipulate data?
          – Read/Write Ratio
          – Types of Queries / Updates
          – Data life-cycle and growth
          – Analytics (Map Reduce, Aggregation)
  • Tools for Data Access
      • Flexible Schemas
      • Embedded data structures
      • Secondary Indexes
      • Multi-Key Indexes
      • Aggregation Framework
          – Pipeline operators: $project, $match, $limit, $skip, $sort, $group, $unwind
      • No Joins
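The pipeline operators listed above can be pictured as ordinary array transformations. The following is a minimal sketch in plain JavaScript (runnable in Node.js without a database) that only imitates what $unwind, $group, and $sort do. The sample data and the names `sampleBooks` and `booksPerAuthor` are made up for illustration and are not from the talk.

```javascript
// Hypothetical sample data standing in for a "books" collection.
const sampleBooks = [
  { title: "Book A", authors: ["Author X", "Author Y"] },
  { title: "Book B", authors: ["Author X"] },
];

// Like $unwind: emit one record per element of the "authors" array.
const unwound = sampleBooks.flatMap((book) =>
  book.authors.map((author) => ({ title: book.title, author }))
);

// Like $group: count how many books each author appears on.
const counts = new Map();
for (const { author } of unwound) {
  counts.set(author, (counts.get(author) || 0) + 1);
}

// Like $sort: order the groups by count, descending.
const booksPerAuthor = [...counts]
  .map(([author, count]) => ({ _id: author, count }))
  .sort((a, b) => b.count - a.count);

console.log(booksPerAuthor);
```

On a real server the same stages would run inside the database rather than in application code.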
  • Data Manipulation
      • Conditional Query Operators
          – Scalar: $ne, $mod, $exists, $type, $lt, $lte, $gt, $gte
          – Vector: $in, $nin, $all, $size
      • Atomic Update Operators
          – Scalar: $inc, $set, $unset
          – Vector: $push, $pop, $pull, $pushAll, $pullAll, $addToSet
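To make the update operators concrete, here is a simplified in-memory model of four of them ($set, $inc, $push, $addToSet). It is a sketch under stated assumptions, not MongoDB's implementation: it handles top-level fields and primitive values only, and the helper `applyUpdate` and the fields `visits`, `tags`, and `notes` are hypothetical.

```javascript
// Apply a small subset of update operators to a plain object, in place.
function applyUpdate(doc, update) {
  // $set: overwrite (or create) a field.
  for (const [field, value] of Object.entries(update.$set || {})) {
    doc[field] = value;
  }
  // $inc: add to a numeric field, treating a missing field as 0.
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    doc[field] = (doc[field] || 0) + amount;
  }
  // $push: append to an array, creating it if needed.
  for (const [field, value] of Object.entries(update.$push || {})) {
    (doc[field] = doc[field] || []).push(value);
  }
  // $addToSet: append only if the value is not already present.
  for (const [field, value] of Object.entries(update.$addToSet || {})) {
    const arr = (doc[field] = doc[field] || []);
    if (!arr.includes(value)) arr.push(value);
  }
  return doc;
}

const patronDoc = { _id: "joe", name: "Joe Bookreader", visits: 1 };
applyUpdate(patronDoc, {
  $inc: { visits: 1 },
  $set: { state: "MA" },
  $addToSet: { tags: "member" },
});
// Adding the same tag again leaves the array unchanged; $push always appends.
applyUpdate(patronDoc, { $addToSet: { tags: "member" }, $push: { notes: "renewed" } });

console.log(patronDoc);
```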
  • Schema Design Example
  • Library Management Application
      • Patrons
      • Books
      • Authors
      • Publishers
  • One to One Relations (example)
  • Modeling Patrons
      // Option 1: address stored as a separate document
      patron = {
        _id: "joe",
        name: "Joe Bookreader"
      }
      address = {
        patron_id: "joe",
        street: "123 Fake St.",
        city: "Faketon",
        state: "MA",
        zip: 12345
      }

      // Option 2: address embedded in the patron document
      patron = {
        _id: "joe",
        name: "Joe Bookreader",
        address: {
          street: "123 Fake St.",
          city: "Faketon",
          state: "MA",
          zip: 12345
        }
      }
  • One to One Relations
      • “Contains” relationships are often embedded.
      • Document provides a holistic representation of objects with embedded entities.
      • Optimized read performance.
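The read-performance point can be illustrated by counting lookups. This is a minimal sketch using in-memory arrays and maps as stand-ins for collections; the helper names `cityLinked` and `cityEmbedded` are hypothetical, and a real application would issue database queries instead.

```javascript
// Linked layout: patron and address live in separate "collections".
const patronsLinked = new Map([["joe", { _id: "joe", name: "Joe Bookreader" }]]);
const addressDocs = [
  { patron_id: "joe", street: "123 Fake St.", city: "Faketon", state: "MA", zip: 12345 },
];

// Embedded layout: the address is part of the patron document.
const patronsEmbedded = new Map([
  ["joe", {
    _id: "joe",
    name: "Joe Bookreader",
    address: { street: "123 Fake St.", city: "Faketon", state: "MA", zip: 12345 },
  }],
]);

// Linked: one lookup for the patron, a second one for the address.
function cityLinked(id) {
  const patron = patronsLinked.get(id);
  const address = addressDocs.find((a) => a.patron_id === patron._id);
  return { city: address.city, lookups: 2 };
}

// Embedded: a single lookup returns everything.
function cityEmbedded(id) {
  const patron = patronsEmbedded.get(id);
  return { city: patron.address.city, lookups: 1 };
}

console.log(cityLinked("joe"), cityEmbedded("joe"));
```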
  • One to Many Relations (examples)
  • Patrons with many addresses
      patron = {
        _id: "joe",
        name: "Joe Bookreader",
        join_date: ISODate("2011-10-15"),
        addresses: [
          {street: "1 Vernon St.", city: "Newton", state: "MA", …},
          {street: "52 Main St.", city: "Boston", state: "MA", …}
        ]
      }
  • One to Many Relations (example 2): Publishers and Books
  • Publishers and Books relation
      • Publishers put out many books
      • Books have one publisher
  • Book Data
      MongoDB: The Definitive Guide
      By Kristina Chodorow and Mike Dirolf
      Published: 9/24/2010
      Pages: 216
      Language: English
      Publisher: O’Reilly Media, CA
  • Book Model with Embedded Publisher
      book = {
        title: "MongoDB: The Definitive Guide",
        authors: [ "Kristina Chodorow", "Mike Dirolf" ],
        published_date: ISODate("2010-09-24"),
        pages: 216,
        language: "English",
        publisher: {
          name: "O’Reilly Media",
          founded: "1980",
          location: "CA"
        }
      }
  • Book Model with Normalized Publisher
      publisher = {
        name: "O’Reilly Media",
        founded: "1980",
        location: "CA"
      }
      book = {
        title: "MongoDB: The Definitive Guide",
        authors: [ "Kristina Chodorow", "Mike Dirolf" ],
        published_date: ISODate("2010-09-24"),
        pages: 216,
        language: "English"
      }
  • Link with Publisher _id as a Reference
      publisher = {
        _id: "oreilly",
        name: "O’Reilly Media",
        founded: "1980",
        location: "CA"
      }
      book = {
        title: "MongoDB: The Definitive Guide",
        authors: [ "Kristina Chodorow", "Mike Dirolf" ],
        published_date: ISODate("2010-09-24"),
        pages: 216,
        language: "English",
        publisher_id: "oreilly"
      }
  • Link with Book _ids as a Reference
      publisher = {
        name: "O’Reilly Media",
        founded: "1980",
        location: "CA",
        books: [ "123456789", ... ]
      }
      book = {
        _id: "123456789",
        title: "MongoDB: The Definitive Guide",
        authors: [ "Kristina Chodorow", "Mike Dirolf" ],
        published_date: ISODate("2010-09-24"),
        pages: 216,
        language: "English"
      }
  • Where do you put the reference?
      • Reference to single publisher on books
          – Use when items have unbounded growth (unlimited # of books)
      • Array of books in publisher document
          – Optimal when many means a handful of items
          – Use when there is a bound on potential growth
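The two placements lead to different read paths. The sketch below models both with in-memory arrays; it is an illustration under stated assumptions rather than a recommended implementation. The second book and the helper names `booksViaBookReference` and `booksViaPublisherArray` are hypothetical.

```javascript
const publisherDocs = [
  // Option B keeps an array of book _ids here; it grows with every new book.
  { _id: "oreilly", name: "O'Reilly Media", books: ["123456789", "000000001"] },
];
const bookDocs = [
  // Option A keeps a single publisher_id on each book; the publisher never grows.
  { _id: "123456789", title: "MongoDB: The Definitive Guide", publisher_id: "oreilly" },
  { _id: "000000001", title: "Hypothetical Second Book", publisher_id: "oreilly" },
];

// Option A: filter books by the reference they carry.
// (In the shell this would be a query such as db.books.find({ publisher_id: "oreilly" }).)
function booksViaBookReference(publisherId) {
  return bookDocs.filter((b) => b.publisher_id === publisherId).map((b) => b.title);
}

// Option B: read the publisher first, then resolve each _id in its array.
function booksViaPublisherArray(publisherId) {
  const publisher = publisherDocs.find((p) => p._id === publisherId);
  return publisher.books.map((id) => bookDocs.find((b) => b._id === id).title);
}

console.log(booksViaBookReference("oreilly"));
console.log(booksViaPublisherArray("oreilly"));
```

Both functions return the same titles; the difference is which document has to change, and keep growing, when a book is added.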
  • One to Many Relations (example 3): Books and Patrons
  • Books and Patrons
      • A book can be checked out by one patron at a time
      • Patrons can check out many books (but not 1000s)
  • Modeling Checkouts
      patron = {
        _id: "joe",
        name: "Joe Bookreader",
        join_date: ISODate("2011-10-15"),
        address: { ... }
      }
      book = {
        _id: "123456789",
        title: "MongoDB: The Definitive Guide",
        authors: [ "Kristina Chodorow", "Mike Dirolf" ],
        ...
      }
  • Modeling Checkouts
      patron = {
        _id: "joe",
        name: "Joe Bookreader",
        join_date: ISODate("2011-10-15"),
        address: { ... },
        checked_out: [
          { _id: "123456789", checked_out: "2012-10-15" },
          { _id: "987654321", checked_out: "2012-09-12" },
          ...
        ]
      }
  • De-normalization provides data locality
  • Modeling Checkouts (de-normalized)
      patron = {
        _id: "joe",
        name: "Joe Bookreader",
        join_date: ISODate("2011-10-15"),
        address: { ... },
        checked_out: [
          { _id: "123456789",
            title: "MongoDB: The Definitive Guide",
            authors: [ "Kristina Chodorow", "Mike Dirolf" ],
            checked_out: ISODate("2012-10-15") },
          { _id: "987654321",
            title: "MongoDB: The Scaling Adventure",
            ... },
          ...
        ]
      }
  • Referencing vs. Embedding
      • Embedding is a bit like pre-joining data
      • Document-level operations are easy for the server to handle
      • Embed when the “many” objects always appear with (are viewed in the context of) their parents
      • Reference when you need more flexibility
    How does your application access and manipulate data?
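"Pre-joining" can be seen by comparing how the checked-out titles are read in each layout. This is a sketch in plain JavaScript with in-memory data taken from the checkout slides; the helpers `titlesByJoin` and `titlesEmbedded` are hypothetical names, and the join here is done in application code, as it would be without server-side joins.

```javascript
const bookCollection = [
  { _id: "123456789", title: "MongoDB: The Definitive Guide" },
  { _id: "987654321", title: "MongoDB: The Scaling Adventure" },
];

// Referenced layout: the patron stores only book _ids.
const referencedPatron = {
  _id: "joe",
  checked_out: [
    { _id: "123456789", checked_out: "2012-10-15" },
    { _id: "987654321", checked_out: "2012-09-12" },
  ],
};

// De-normalized layout: the title is copied into the patron document.
const denormalizedPatron = {
  _id: "joe",
  checked_out: [
    { _id: "123456789", title: "MongoDB: The Definitive Guide", checked_out: "2012-10-15" },
    { _id: "987654321", title: "MongoDB: The Scaling Adventure", checked_out: "2012-09-12" },
  ],
};

// Referenced: collect the ids, then fetch matching books (similar in spirit to an $in query).
function titlesByJoin(patron, books) {
  const ids = new Set(patron.checked_out.map((c) => c._id));
  return books.filter((b) => ids.has(b._id)).map((b) => b.title);
}

// De-normalized: the join was done at write time, so reading is a plain map.
function titlesEmbedded(patron) {
  return patron.checked_out.map((c) => c.title);
}

console.log(titlesByJoin(referencedPatron, bookCollection));
console.log(titlesEmbedded(denormalizedPatron));
```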
  • Many to Many Relations (example)
  • Books and Authors
      book = {
        title: "MongoDB: The Definitive Guide",
        published_date: ISODate("2010-09-24"),
        pages: 216,
        language: "English"
      }
      author = {
        _id: "kchodorow",
        name: "Kristina Chodorow",
        hometown: "New York"
      }
      author = {
        _id: "mdirolf",
        name: "Mike Dirolf",
        hometown: "Albany"
      }
  • Relation stored in Book document
      book = {
        title: "MongoDB: The Definitive Guide",
        authors: [
          { _id: "kchodorow", name: "Kristina Chodorow" },
          { _id: "mdirolf", name: "Mike Dirolf" }
        ],
        published_date: ISODate("2010-09-24"),
        pages: 216,
        language: "English"
      }
      author = {
        _id: "kchodorow",
        name: "Kristina Chodorow",
        hometown: "New York"
      }
      author = {
        _id: "mdirolf",
        name: "Mike Dirolf",
        hometown: "Albany"
      }
  • Relation stored in Author document
      book = {
        _id: 123456789,
        title: "MongoDB: The Definitive Guide",
        published_date: ISODate("2010-09-24"),
        pages: 216,
        language: "English"
      }
      author = {
        _id: "kchodorow",
        name: "Kristina Chodorow",
        hometown: "Cincinnati",
        books: [
          { book_id: 123456789, title: "MongoDB: The Definitive Guide" }
        ]
      }
  • Relation stored in both documents
      book = {
        _id: 123456789,
        title: "MongoDB: The Definitive Guide",
        authors: [ "kchodorow", "mdirolf" ],
        published_date: ISODate("2010-09-24"),
        pages: 216,
        language: "English"
      }
      author = {
        _id: "kchodorow",
        name: "Kristina Chodorow",
        hometown: "New York",
        books: [ 123456789, ... ]
      }
      author = {
        _id: "mdirolf",
        name: "Mike Dirolf",
        hometown: "Albany",
        books: [ 123456789, ... ]
      }
  • Where do you put the reference? Think about common queries
      book = {
        title: "MongoDB: The Definitive Guide",
        authors: [
          { _id: "kchodorow", name: "Kristina Chodorow" },
          { _id: "mdirolf", name: "Mike Dirolf" }
        ],
        published_date: ISODate("2010-09-24"),
        pages: 216,
        language: "English"
      }
      author = {
        _id: "kchodorow",
        name: "Kristina Chodorow",
        hometown: "New York"
      }
      // Dotted field names must be quoted
      db.books.find( { "authors.name": "Kristina Chodorow" } )
  • Where do you put the reference? Think about indexes
      (same book and author documents as on the previous slide)
      db.books.createIndex( { "authors.name": 1 } )
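An index on a field inside an array (a multi-key index) can be pictured as one index entry per array element. The following is a conceptual sketch only, using a `Map` in plain JavaScript; it does not reflect how the server stores indexes. The helper `buildMultikeyIndex` is a hypothetical name, and the `_id` on the book is borrowed from other slides as an assumption.

```javascript
const indexedBooks = [
  {
    _id: 123456789, // assumed _id; this slide's book document does not show one
    title: "MongoDB: The Definitive Guide",
    authors: [
      { _id: "kchodorow", name: "Kristina Chodorow" },
      { _id: "mdirolf", name: "Mike Dirolf" },
    ],
  },
];

// Build a lookup from each array element's key value to the _ids of documents containing it.
function buildMultikeyIndex(docs, arrayField, key) {
  const index = new Map();
  for (const doc of docs) {
    for (const element of doc[arrayField] || []) {
      const value = element[key];
      if (!index.has(value)) index.set(value, []);
      index.get(value).push(doc._id);
    }
  }
  return index;
}

const authorNameIndex = buildMultikeyIndex(indexedBooks, "authors", "name");

// A query on "authors.name" can now go straight to the matching documents.
console.log(authorNameIndex.get("Kristina Chodorow"));
```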
  • Trees (example)
  • Parent References
      book = {
        title: "MongoDB: The Definitive Guide",
        authors: [ "Kristina Chodorow", "Mike Dirolf" ],
        published_date: ISODate("2010-09-24"),
        pages: 216,
        language: "English",
        category: "MongoDB"
      }
      category = { _id: "MongoDB", parent: "Databases" }
      category = { _id: "Databases", parent: "Programming" }
  • Child References
      book = {
        _id: 123456789,
        title: "MongoDB: The Definitive Guide",
        authors: [ "Kristina Chodorow", "Mike Dirolf" ],
        published_date: ISODate("2010-09-24"),
        pages: 216,
        language: "English"
      }
      category = { _id: "MongoDB", children: [ 123456789, … ] }
      category = { _id: "Databases", children: [ "MongoDB", "Postgres" ] }
      category = { _id: "Programming", children: [ "Databases", "Languages" ] }
  • Modeling Trees
      • Parent References
          – Each node is stored as a document
          – Contains the id of the parent
      • Child References
          – Each node contains ids of its children
          – Can support graphs (multiple parents per child)
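With parent references, finding all ancestors of a node means following one reference at a time. Below is a minimal sketch over an in-memory `Map`; the root document with `parent: null` and the helper name `ancestorsOf` are assumptions for illustration, since the slides do not show a root node.

```javascript
const categoryDocs = new Map([
  ["MongoDB", { _id: "MongoDB", parent: "Databases" }],
  ["Databases", { _id: "Databases", parent: "Programming" }],
  ["Programming", { _id: "Programming", parent: null }], // assumed root
]);

// Walk up the tree, collecting ancestors from the root down to the direct parent.
function ancestorsOf(id, categories) {
  const path = [];
  let node = categories.get(id);
  while (node && node.parent) {
    path.unshift(node.parent);
    node = categories.get(node.parent); // one lookup per level of the tree
  }
  return path;
}

console.log(ancestorsOf("MongoDB", categoryDocs));
```

Each loop iteration corresponds to a separate lookup, so the cost grows with the depth of the tree.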
  • Array of Ancestors
      book = {
        title: "MongoDB: The Definitive Guide",
        authors: [ "Kristina Chodorow", "Mike Dirolf" ],
        published_date: ISODate("2010-09-24"),
        pages: 216,
        language: "English",
        categories: [ "Programming", "Databases", "MongoDB" ]
      }
      book = {
        title: "MySQL: The Definitive Guide",
        authors: [ "Michael Kofler" ],
        published_date: ISODate("2010-09-24"),
        pages: 216,
        language: "English",
        parent: "MySQL",
        ancestors: [ "Programming", "Databases", "MySQL" ]
      }
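Storing the full ancestor path on each document turns a subtree query into a simple array-membership test. The sketch below assumes both books use an `ancestors` field (the slide's first book calls it `categories`); `inSubtree` is a hypothetical helper, and the data lives in memory rather than in a collection.

```javascript
const catalog = [
  {
    title: "MongoDB: The Definitive Guide",
    parent: "MongoDB",
    ancestors: ["Programming", "Databases", "MongoDB"],
  },
  {
    title: "MySQL: The Definitive Guide",
    parent: "MySQL",
    ancestors: ["Programming", "Databases", "MySQL"],
  },
];

// Everything under a category: documents whose ancestor array contains it.
// (A shell query matching a scalar against an array field behaves similarly,
// for example db.books.find({ ancestors: "Databases" }).)
function inSubtree(docs, category) {
  return docs.filter((doc) => doc.ancestors.includes(category));
}

console.log(inSubtree(catalog, "Databases").map((doc) => doc.title));
console.log(inSubtree(catalog, "MySQL").map((doc) => doc.title));
```

No tree walk is needed at read time; the price is that every document must be rewritten if a category moves.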
  • Single Table Inheritance (example)
  • Single Table Inheritance
      book = {
        title: "MongoDB: The Definitive Guide",
        authors: [ "Kristina Chodorow", "Mike Dirolf" ],
        published_date: ISODate("2010-09-24"),
        kind: "loanable",
        locations: [ ... ],
        pages: 216,
        language: "English",
        publisher: {
          name: "O’Reilly Media",
          founded: "1980",
          location: "CA"
        }
      }
  • Queues (example)
  • Update highest priority request
      db.loans.insert({
        _id: 123456789,
        book_id: 987654321,
        pending: false,
        approved: false,
        priority: 3
      })

      // Find the highest priority request and mark as pending approval
      request = db.loans.findAndModify({
        query: { pending: false, book_id: 987654321 },
        sort: { priority: -1 },
        update: { $set: { pending: true, started: new ISODate() } }
      })
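The queue pattern above relies on finding and updating a document in a single step. Here is a rough in-memory imitation of that behavior in plain JavaScript, useful for reasoning about what the call returns. The helper `claimHighestPriority` and the second loan request are hypothetical; on a real server the find-and-update is atomic per document, which this single-threaded sketch only mimics.

```javascript
// Pick the highest-priority non-pending request for a book, mark it pending,
// and return the document as it looked before the update
// (findAndModify returns the pre-update document unless asked otherwise).
function claimHighestPriority(loans, bookId, now = new Date()) {
  const candidates = loans
    .filter((loan) => loan.pending === false && loan.book_id === bookId)
    .sort((a, b) => b.priority - a.priority); // like sort: { priority: -1 }
  const request = candidates[0];
  if (!request) return null;
  const before = { ...request };
  request.pending = true; // like $set: { pending: true, started: ... }
  request.started = now;
  return before;
}

const loanQueue = [
  { _id: 123456789, book_id: 987654321, pending: false, approved: false, priority: 3 },
  // Hypothetical second request, added so the sort has something to choose between.
  { _id: 123456790, book_id: 987654321, pending: false, approved: false, priority: 5 },
];

const firstClaim = claimHighestPriority(loanQueue, 987654321);
console.log(firstClaim._id, loanQueue[1].pending);
```

Calling the function again claims the next request, and it returns `null` once nothing is left to claim.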
  • Summary
      • Schema design is different in MongoDB
      • Basic data design principles apply
      • Focus on how the application accesses and manipulates data
      • Evolve the schema to meet changing requirements
      • Application-level logic is important!
  • #mongodbdays: Thank You. Emily Stolfo, Ruby Engineer/Evangelist, 10gen (@EmStolfo)