
Schema design with MongoDB (Dwight Merriman)


  1. Schema Design with MongoDB
     Dwight Merriman
     CEO
     10gen
  2. What is document-oriented?
     JSON objects
     Not relational
     Not OODB
     Database schema != program "schema"
  3. Terms
     Row -> JSON document
     Tables -> collections
     Indexes -> index
     Join -> embedding and linking
  4. Choose a schema that
     makes queries easy
     makes queries fast
     facilitates atomicity
     facilitates sharding
  5. Key question: embed vs. link
     "Contains relationship": embed
     Embed = "pre-joined"
     Links: client/server turnarounds
     On a close call, embed. Use rich documents.
     Note the 4MB object size limit
     Arbitrary limit but pushes one towards good designs
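The embed-versus-link trade-off on slide 5 can be sketched in plain JavaScript without a server. The collections, field names (`post_id`, `title`) and the helper `loadLinkedPost` below are hypothetical and only illustrate the idea: an embedded document arrives "pre-joined", while a linked design needs a second lookup that the client has to stitch together.

```javascript
// Embedded ("pre-joined"): the comments live inside the post document.
const embeddedPost = {
  _id: 1,
  title: "Schema design",
  comments: [
    { by: "mathias", text: "first" },
    { by: "eliot", text: "second" },
  ],
};

// Linked: comments are separate documents that point back via post_id.
// These arrays are in-memory stand-ins for two collections.
const posts = [{ _id: 1, title: "Schema design" }];
const comments = [
  { _id: 10, post_id: 1, by: "mathias", text: "first" },
  { _id: 11, post_id: 1, by: "eliot", text: "second" },
  { _id: 12, post_id: 2, by: "mike", text: "elsewhere" },
];

// With links the client performs the "join": one lookup for the post and
// a second one for its comments (two round trips against a real server).
function loadLinkedPost(postId) {
  const post = posts.find((p) => p._id === postId);
  if (!post) return null;
  const postComments = comments
    .filter((c) => c.post_id === postId)
    .map(({ by, text }) => ({ by, text }));
  return { ...post, comments: postComments };
}

console.log(loadLinkedPost(1));
```

Both designs end up with the same shape in the application; the difference is how many requests it takes to get there and whether the comments can be queried on their own.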
  11. Map-Reduce
  12. { _id : … }
      Should be:
      Unique
      Invariant
      Ideally, not reused (delete; insert)
      ObjectID type often best for a sharded collection
  13. Trees
      mongodb.org/display/DOCS/Trees+in+MongoDB
  14. Single "Table" Inheritance Works Well
      > t.find()
      { type:'irregular-shape', area:99 }
      { type:'circle', area:3.14, radius:1 }
      { type:'square', area:4, d:2 }
      { type:'rect', area:8, x:2, y:4 }
      > t.find( { radius : { $gt : 2.0 } } )
      > t.ensureIndex( { radius : 1 } ) // fine
  15. (1) Full tree in one document
      {
        comments: [
          {by: "mathias", text: "...",
           replies: []},
          {by: "eliot", text: "...",
           replies: [
             {by: "mike", text: "...", replies: []}
           ]}
        ]
      }
      Pros:
      - Single document to fetch per page
      - One location on disk for whole tree
      - You can see full structure easily
      Cons:
      - Hard to search
      - Hard to get back partial results
      - 4MB limit
  22. (2) Parent Links
      > t = db.tree1;
      > t.find()
      { "_id" : 1 }
      { "_id" : 2, "parent" : 1 }
      { "_id" : 3, "parent" : 1 }
      { "_id" : 4, "parent" : 2 }
      { "_id" : 5, "parent" : 4 }
      { "_id" : 6, "parent" : 4 }
      > // find children of node 4
      > t.ensureIndex({parent:1})
      > t.find( {parent : 4 } )
      { "_id" : 5, "parent" : 4 }
      { "_id" : 6, "parent" : 4 }
      // hard to get all descendants
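To make "hard to get all descendants" concrete, here is a minimal in-memory sketch using the same six nodes as the Parent Links slide. The helpers `findChildrenOf` and `findDescendants` are hypothetical; `findChildrenOf` stands in for a query such as `t.find({ parent: { $in: ids } })`, and the point is that the client needs one query per tree level.

```javascript
// Same six nodes as the parent-links slide.
const tree1 = [
  { _id: 1 },
  { _id: 2, parent: 1 },
  { _id: 3, parent: 1 },
  { _id: 4, parent: 2 },
  { _id: 5, parent: 4 },
  { _id: 6, parent: 4 },
];

// Stand-in for one round trip: all documents whose parent is in `ids`.
function findChildrenOf(ids) {
  return tree1.filter((doc) => ids.includes(doc.parent));
}

// Walk the tree level by level, counting the simulated queries.
function findDescendants(rootId) {
  const found = [];
  let frontier = [rootId];
  let queries = 0;
  while (frontier.length > 0) {
    const children = findChildrenOf(frontier);
    queries += 1;
    frontier = children.map((doc) => doc._id);
    found.push(...frontier);
  }
  return { ids: found, queries };
}

console.log(findDescendants(2));
```

The number of queries grows with the depth of the subtree, including a final query that comes back empty.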
  23. (3) Array of Ancestors
      > t = db.mytree;
      > t.find()
      { "_id" : "a" }
      { "_id" : "b", "ancestors" : [ "a" ], "parent" : "a" }
      { "_id" : "c", "ancestors" : [ "a", "b" ], "parent" : "b" }
      { "_id" : "d", "ancestors" : [ "a", "b" ], "parent" : "b" }
      { "_id" : "e", "ancestors" : [ "a" ], "parent" : "a" }
      { "_id" : "f", "ancestors" : [ "a", "e" ], "parent" : "e" }
      { "_id" : "g", "ancestors" : [ "a", "b", "d" ], "parent" : "d" }
      > t.ensureIndex( { ancestors : 1 } )
      > // find all descendants of b:
      > t.find( { ancestors : 'b' })
      { "_id" : "c", "ancestors" : [ "a", "b" ], "parent" : "b" }
      { "_id" : "d", "ancestors" : [ "a", "b" ], "parent" : "b" }
      { "_id" : "g", "ancestors" : [ "a", "b", "d" ], "parent" : "d" }
      > // get all ancestors of f:
      > anc = db.mytree.findOne({_id:'f'}).ancestors
      [ "a", "e" ]
      > db.mytree.find( { _id : { $in : anc } } )
      { "_id" : "a" }
      { "_id" : "e", "ancestors" : [ "a" ], "parent" : "a" }
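The slide shows the stored documents but not how the `ancestors` array gets filled in. One plausible approach, sketched here against an in-memory `Map` rather than a real collection, is to copy the parent's array and append the parent's `_id` at insert time. The helpers `insertNode` and `findDescendantsByAncestor` are hypothetical names used only for this illustration.

```javascript
// In-memory stand-in for db.mytree, keyed by _id.
const mytree = new Map();

// Insert a node, deriving its ancestors array from its parent's array.
function insertNode(id, parentId) {
  if (parentId === undefined) {
    const root = { _id: id };
    mytree.set(id, root);
    return root;
  }
  const parent = mytree.get(parentId);
  if (!parent) throw new Error(`unknown parent: ${parentId}`);
  const doc = {
    _id: id,
    ancestors: [...(parent.ancestors || []), parentId],
    parent: parentId,
  };
  mytree.set(id, doc);
  return doc;
}

// Stand-in for t.find({ ancestors: ancestorId }): a single lookup.
function findDescendantsByAncestor(ancestorId) {
  return [...mytree.values()]
    .filter((doc) => (doc.ancestors || []).includes(ancestorId))
    .map((doc) => doc._id);
}

// Rebuild the tree from the slide.
insertNode("a");
insertNode("b", "a");
insertNode("c", "b");
insertNode("d", "b");
insertNode("e", "a");
insertNode("f", "e");
insertNode("g", "d");

console.log(findDescendantsByAncestor("b"));
```

The cost of this design moves to writes: relocating a subtree means rewriting the `ancestors` array of every node underneath it.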
  24. Atomicity
      Atomicity at the document level
      $operators
      Compare and swap
  25. Compare and Swap
      > t=db.inventory
      > s = t.findOne( {sku:'abc'} )
      > qty_old = s.qty;
      > --s.qty;
      > t.update({_id:s._id, qty:qty_old}, s);
      > db.getLastError()
      {"err" : null, "updatedExisting" : true , "n" : 1 , "ok" : 1} // it worked
  26. Compare and Swap
      > t=db.inventory
      > s = t.findOne( {sku:'abc'} )
      > qty_old = s.qty;
      > --s.qty;
      > t.update({_id:s._id, qty:qty_old}, s);
      > db.getLastError()
      {"err" : null, "updatedExisting" : true , "n" : 1 , "ok" : 1} // it worked
      Oops?
  27. Compare and Swap - Better
      > t=db.inventory
      > s = t.findOne( {sku:'abc'} )
      > obj_old = Object.extend({}, s);
      > --s.qty;
      > // t.update({_id:s._id, qty:qty_old}, s);
      > t.update( obj_old , s);
      > print( db.getLastError().updatedExisting ? "worked" : "try again" );
  28. Compare and Swap – versions
      update( { _id : myid, ver : last_ver },
              { $set : { x : "abc", y : 99 },
                $inc : { ver : 1 }
              } )
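The versioned update on slide 28 only succeeds if nobody else has bumped `ver` since the read, so in practice it tends to sit inside a read-modify-write retry loop. The sketch below simulates that loop in memory; `store`, `updateIfVersion` and `casUpdate` are hypothetical stand-ins, with `updateIfVersion` playing the role of the `update( { _id, ver }, { $set, $inc } )` call and returning the number of documents it changed.

```javascript
// In-memory stand-in for a collection holding one versioned document.
const store = new Map([["myid", { _id: "myid", x: "old", y: 1, ver: 0 }]]);

// Apply the change only if the stored version still matches; return n.
function updateIfVersion(id, expectedVer, fields) {
  const doc = store.get(id);
  if (!doc || doc.ver !== expectedVer) return 0;
  Object.assign(doc, fields); // plays the role of $set
  doc.ver += 1;               // plays the role of $inc on ver
  return 1;
}

// Read, compute the new fields, attempt the write, retry on conflict.
function casUpdate(id, computeFields, maxAttempts = 5) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const snapshot = { ...store.get(id) };
    const fields = computeFields(snapshot);
    if (updateIfVersion(id, snapshot.ver, fields) === 1) return attempt;
  }
  throw new Error("too many conflicts");
}

// Simulate another writer sneaking in during the first attempt.
let interfered = false;
const attempts = casUpdate("myid", (snap) => {
  if (!interfered) {
    interfered = true;
    updateIfVersion("myid", snap.ver, { y: 50 }); // the competing write
  }
  return { x: "abc", y: snap.y + 1 };
});

console.log(attempts, store.get("myid"));
```

The first attempt is rejected because the version moved, and the retry recomputes from the fresh value, so the competing write is not silently overwritten.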
  29. Of course, don't bother doing CAS when you don't have to
      > t.update( { sku : "abc",
                    qty : {$gt:0} },
                  { $inc : { qty : -1 } }
                )
      > db.getLastError()
      { "updatedExisting" : true , "n" : 1 , "ok" : 1 }
      { "updatedExisting" : false , "n" : 0 , "ok" : 1 }
  32. Sharding and Schemas
      Shard key selection
      Restrictions on unique indexes
      Consider using many collections when that is natural
      10 SN
  33. Other Considerations
      Capped collections
  34. Questions?
      Get involved with the MongoDB project!
      Coding, drivers, frameworks, documentation, translation, consulting, evangelism, suggestions, vote on jira… spread the word.
