Schema Design in MongoDB - TriMug Meetup North Carolina

The Schema Design talk given at the TriMug meetup in Durham, NC

  1. Schema Design
     J. Randall Hunt, Developer and Evangelist at MongoDB
  2. Who am I?
     • J. Randall Hunt
     • @jrhunt
     • github.com/ranman
     • randall@mongodb.com
  3. Why are you here?
     • To learn about MongoDB
     • To engage in the MongoDB community
     • To get free stuff
  4. Levels of Abstraction!
  5. ORMs To Save The Day!
  6. Why change something that's been around for 40 years?
  7. 10TB: Data Humankind Has Produced Until 1991
  8. 10TB (repeated to fill the slide): Data Mankind Produces Every Day Since 2001
  9. NOSQL/NOREL
  10. Relational Schema Design: Focus on data storage
  11. Document Schema Design: Focus on data use
  12. Why MongoDB?
      • Focus on commodity hardware, not insane machines
      • Document Store
      • Dynamic Schema
      • Sensible Defaults
      • Modern Scaling Infrastructure
  13. How People Use MongoDB
      • Analytics
      • Risk Management
      • Caching Layer
      • Recommendation Engines
      • GIS
  14. Nitty Gritty
  15. RDBMS          MongoDB
      Database       Database
      Table          Collection
      Row            Document
      Index          Index
      Join           Embedded Document
      Foreign Key    Reference
  16. Documents?
  17. { "hello": "world" }
  18. {
        "_id": ObjectId("51638f8332e9bc556fe86de7"),
        "dstats": [
          { "+": "5", "-": "0", "f": "gitstreamer.py" },
          { "+": "3", "-": "3", "f": "post-commit.py" }
        ],
        "author": "ranman",
        "ts": ISODate("2013-04-08T19:48:11-0400"),
        "project": "gitstreamer",
        "msg": "turning this into a webapp"
      }
  19. CRUD
  20. test> db.test.find()
      Fetched 0 record(s) in 1ms -- Index[none]
  21. test> db.test.insert({'hello': 'world'})
      Inserted 1 record(s) in 1ms
      Insert WriteResult({ "ok": 1, "n": 1 })
  22. test> db.test.find({'hello': 'world'})
      { "_id": ObjectId("52d61af21486ef9e06d6d41a"), "hello": "world" }
      Fetched 1 record(s) in 0ms -- Index[none]
  23. test> db.test.update({'hello': 'world'}, {$set: {'hello': 'welt'}})
      Updated 1 existing record(s) in 0ms
      Update WriteResult({ "ok": 1, "n": 1 })
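The shell session above can be mimicked without a running server. What follows is a minimal sketch in plain JavaScript, assuming an in-memory array stands in for the `test` collection. The helpers `matches`, `insert`, `find`, and `updateOne`, and the integer `_id` counter, are hypothetical illustrations and not part of the MongoDB API. The sketch only models exact-match queries and `$set`.

```javascript
// In-memory stand-in for a collection. This is NOT the MongoDB driver or shell.
const collection = [];
let nextId = 1; // hypothetical counter used in place of a real ObjectId

// A document matches when every field in the query equals the same field in the document.
function matches(doc, query) {
  return Object.keys(query).every((key) => doc[key] === query[key]);
}

// Store a copy of the document with a generated _id.
function insert(doc) {
  const stored = { _id: nextId++, ...doc };
  collection.push(stored);
  return stored;
}

// An empty query matches every document, as in the shell session.
function find(query = {}) {
  return collection.filter((doc) => matches(doc, query));
}

// Apply the $set fields to the first matching document; return how many were updated.
function updateOne(query, update) {
  const doc = collection.find((d) => matches(d, query));
  if (!doc) return 0;
  Object.assign(doc, update.$set);
  return 1;
}

// Replay the four steps from the slides.
const before = find();
insert({ hello: 'world' });
const found = find({ hello: 'world' });
const updated = updateOne({ hello: 'world' }, { $set: { hello: 'welt' } });
const after = find({ hello: 'welt' });

console.log(before.length, found.length, updated, after.length);
```

The point of the sketch is that a query is itself a document: matching is field-by-field comparison, and `$set` changes only the named fields while leaving the rest of the document, including `_id`, untouched.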
  24. Lots of Operators!
  25. Enough already I know what MongoDB is! Teach me schema design!
  26. Library Management
      • Patrons
      • Books
      • Authors
      • Publishers
  27. One To One Relations
  28. patron = {
        _id: ObjectId("52d7173817d8bbd9564613cd"),
        name: 'Joe Schmoe'
      }

      address = {
        patron_id: ObjectId("52d7173817d8bbd9564613cd"),
        street: "100 Five Bridge Rd",
        city: "Clinton",
        state: "NC",
        zip: 28723
      }

      > patron = db.patron.find({'name': 'Joe Schmoe'})[0]
      > db.address.find({'patron_id': patron._id})
  29. patron = {
        _id: ObjectId("52d7173817d8bbd9564613cd"),
        name: 'Joe Schmoe',
        address: {
          street: "100 Five Bridge Rd",
          city: "Clinton",
          state: "NC",
          zip: 28723
        }
      }

      > db.patrons.findOne({'name': /Joe Schmoe/})
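The difference between the two one-to-one layouts above is the number of lookups needed to reach the address. Here is a rough sketch, assuming plain arrays stand in for collections and the string `'p1'` stands in for the ObjectId. The function names `addressByReference` and `addressEmbedded` are made up for illustration.

```javascript
// Referenced layout: patron and address live in separate collections.
const patrons = [{ _id: 'p1', name: 'Joe Schmoe' }]; // 'p1' is a hypothetical id
const addresses = [
  { patron_id: 'p1', street: '100 Five Bridge Rd', city: 'Clinton', state: 'NC', zip: 28723 },
];

// Two lookups: find the patron first, then find the address by the patron's _id.
function addressByReference(name) {
  const patron = patrons.find((p) => p.name === name);
  if (!patron) return null;
  return addresses.find((a) => a.patron_id === patron._id) || null;
}

// Embedded layout: the address is a sub-document of the patron.
const patronsEmbedded = [
  {
    _id: 'p1',
    name: 'Joe Schmoe',
    address: { street: '100 Five Bridge Rd', city: 'Clinton', state: 'NC', zip: 28723 },
  },
];

// One lookup: the address arrives together with the patron document.
function addressEmbedded(name) {
  const patron = patronsEmbedded.find((p) => p.name === name);
  return patron ? patron.address : null;
}

console.log(addressByReference('Joe Schmoe').city);
console.log(addressEmbedded('Joe Schmoe').city);
```

Both functions return the same address; the embedded version simply gets there in a single read, which is the "focus on data use" idea from earlier in the talk.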
  30. One To Many Relations
  31. book = {
        _id: ObjectId(...),
        title: "MongoDB",
        authors: ['Kristina Chodorow', 'Mike Dirolf'],
        published_date: ISODate('2010-09-24'),
        pages: 216,
        language: 'English',
        publisher: {
          name: "O'Reilly Media",
          founded: "1980",
          location: "CA"
        }
      }
  32. 4 ways of modeling one-to-many (there are more)
      • Embed the publisher
      • Use publisher as the "foreign key"
      • Use book as the "foreign key"
      • Hybrid
  33. publisher = {
        _id: "oreilly",
        name: "O'Reilly Media",
        founded: "1980",
        location: "CA"
      }

      book = {
        _id: ObjectId(...),
        title: "MongoDB",
        authors: ['Kristina Chodorow', 'Mike Dirolf'],
        published_date: ISODate('2010-09-24'),
        pages: 216,
        language: 'English',
        publisher_id: 'oreilly'
      }
  34. publisher = {
        name: "O'Reilly Media",
        founded: "1980",
        location: "CA",
        books: [ ObjectId(...), ... ]
      }

      book = {
        _id: ObjectId(...),
        title: "MongoDB",
        authors: ['Kristina Chodorow', 'Mike Dirolf'],
        published_date: ISODate('2010-09-24'),
        pages: 216,
        language: "English"
      }
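The two reference layouts above answer the same questions from opposite directions. The sketch below is one possible way to compare them, assuming in-memory arrays in place of collections. The ids `'b1'` and `'b2'`, the second book, and all function names are hypothetical and exist only for the comparison.

```javascript
// Layout A: each book stores a reference to its publisher.
const publishersA = [{ _id: 'oreilly', name: "O'Reilly Media" }];
const booksA = [
  { _id: 'b1', title: 'MongoDB', publisher_id: 'oreilly' },
  { _id: 'b2', title: 'Another Title', publisher_id: 'oreilly' }, // hypothetical second book
];

// Listing a publisher's books scans the books for a matching publisher_id.
function booksForPublisherA(publisherId) {
  return booksA.filter((b) => b.publisher_id === publisherId);
}

// Layout B: the publisher stores an array of book ids.
const publishersB = [{ _id: 'oreilly', name: "O'Reilly Media", books: ['b1', 'b2'] }];
const booksB = [
  { _id: 'b1', title: 'MongoDB' },
  { _id: 'b2', title: 'Another Title' },
];

// Listing a publisher's books reads the id array, then fetches those books.
function booksForPublisherB(publisherId) {
  const publisher = publishersB.find((p) => p._id === publisherId);
  if (!publisher) return [];
  return booksB.filter((b) => publisher.books.includes(b._id));
}

// Going from a book back to its publisher means searching the id arrays.
function publisherForBookB(bookId) {
  return publishersB.find((p) => p.books.includes(bookId)) || null;
}

console.log(booksForPublisherA('oreilly').map((b) => b.title));
console.log(booksForPublisherB('oreilly').map((b) => b.title));
console.log(publisherForBookB('b2').name);
```

In layout A the publisher document never grows when a book is added; in layout B every new book lengthens the publisher's `books` array, which matters when "many" is large.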
  35. Hybrid Models
      • We store the foreign key
      • and the info relevant to the relation
  36. patron = {
        _id: ObjectId("52d7173817d8bbd9564613cd"),
        name: "Joe Bookreader",
        address: {...},
        join_date: ISODate("2011-10-15"),
        books: [
          {_id: ObjectId(...), title: "MongoDB", author: "Kristina C.", ...},
          {_id: ObjectId(...), title: "Postgres", author: "Randall H.", ...}
        ]
      }
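One way to read the hybrid model above: the embedded copies answer the common question (what has this patron checked out?) in one read, while the stored `_id` still leads to the full book. The sketch below assumes in-memory arrays, hypothetical string ids, and a made-up page count for the second book. `renameBook` is an illustrative helper showing the price of the duplication, not something from the talk.

```javascript
// Full book documents. The page count for 'b2' is a hypothetical value.
const books = [
  { _id: 'b1', title: 'MongoDB', author: 'Kristina C.', pages: 216 },
  { _id: 'b2', title: 'Postgres', author: 'Randall H.', pages: 300 },
];

// Hybrid patron: each entry keeps the foreign key plus the fields shown in listings.
const patron = {
  _id: 'p1',
  name: 'Joe Bookreader',
  books: [
    { _id: 'b1', title: 'MongoDB', author: 'Kristina C.' },
    { _id: 'b2', title: 'Postgres', author: 'Randall H.' },
  ],
};

// The listing needs only the patron document.
function checkedOutTitles(p) {
  return p.books.map((b) => b.title);
}

// Full details follow the stored key into the books collection.
function fullBook(p, index) {
  return books.find((b) => b._id === p.books[index]._id) || null;
}

// The cost of duplication: a rename must touch the book and every embedded copy.
function renameBook(bookId, newTitle, allPatrons) {
  const book = books.find((b) => b._id === bookId);
  if (book) book.title = newTitle;
  for (const p of allPatrons) {
    for (const copy of p.books) {
      if (copy._id === bookId) copy.title = newTitle;
    }
  }
}

console.log(checkedOutTitles(patron));
console.log(fullBook(patron, 0).pages);
```

The duplicated fields are a good fit when they rarely change, as with a title and author; fields that change often are cheaper to keep only in the referenced document.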
  37. Where do you put the foreign key
      • Array of books inside of publisher
        • Makes sense when many means a handful of items
        • Useful when items have bounds on potential growth
      • Reference to a single publisher on each book
        • Useful when items have unbounded growth
  38. Other Things To Model
      • Trees
      • Queues
      • Many-To-Many relationships
  39. Thanks! @jrhunt
